When it comes to content delivery networks, there are a lot of words we use in the industry that are difficult to define. Words like performance, scalability and quality are used every day, as is the term "edge". But depending on who you ask, definitions of what the "edge" is, and the role it plays in delivering video, vary greatly.
For starters, the "edge" is really not a meaningful word if you are trying to define how a CDN is architected and where it distributes traffic from. It has become a misused term that many CDNs use to indicate that traffic is coming from the closest location to the user. But just because content comes from the closest location to the user does not guarantee quality. In fact, many times the content is not even being delivered from the closest location, even though the CDN says it is. There is also the assumption by many in the industry that CDNs cache all video or replicate content at every "edge" location they have, which is simply not the case.
Customers need to ask CDN providers where the servers distributing their specific content are physically located. Ask the CDN directly: where are you actually streaming that video from? In most cases, you can run simple trace routes to see for yourself. As an example, there has been a lot of debate over the past few weeks about the BBC’s iPlayer traffic and the impact it is having on ISPs. But if you do a trace route for iPlayer traffic today, you will find that a lot of it is coming from CDN servers outside the UK. Almost none of the traffic comes from “within” an ISP network, which is where most CDNs classify the "edge" to be. There are a couple of reasons for this.
When moving small objects, such as images, off a CDN, the latency associated with the distance from the CDN server to the consumer’s computer dominates the speed with which the object loads. As such, that server needs to be placed as geographically close to the consumer as possible. Those images are tiny, so those servers are configured with minimal storage. In addition, you can afford to replicate those objects on many, many servers because the total storage cost is low. Compare that to a large object like a video, where latency becomes almost irrelevant next to the overall time it takes to move the whole object. There is an impact on start time, but storage now becomes a much bigger cost.
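To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The model (a fixed number of round trips plus size divided by bandwidth) and every figure in it (a 20 KB image, a 500 MB video, a 5 Mbps connection, 20 ms versus 100 ms round trips) are assumptions chosen for illustration, not measurements from any CDN, and the model ignores things like TCP ramp-up.

```python
def fetch_time_seconds(size_bytes, rtt_ms, bandwidth_mbps, round_trips=2):
    """Very rough fetch time: a few round trips of latency plus raw transfer time."""
    latency_s = round_trips * rtt_ms / 1000.0
    transfer_s = (size_bytes * 8) / (bandwidth_mbps * 1_000_000)
    return latency_s + transfer_s


def latency_share(size_bytes, rtt_ms, bandwidth_mbps, round_trips=2):
    """Fraction of the total fetch time that is pure latency."""
    latency_s = round_trips * rtt_ms / 1000.0
    total_s = fetch_time_seconds(size_bytes, rtt_ms, bandwidth_mbps, round_trips)
    return latency_s / total_s


# Assumed example sizes and link speed, for illustration only.
IMAGE_BYTES = 20_000
VIDEO_BYTES = 500_000_000
BANDWIDTH_MBPS = 5

if __name__ == "__main__":
    for label, size in (("image", IMAGE_BYTES), ("video", VIDEO_BYTES)):
        for rtt_ms in (20, 100):
            total = fetch_time_seconds(size, rtt_ms, BANDWIDTH_MBPS)
            share = latency_share(size, rtt_ms, BANDWIDTH_MBPS)
            print(f"{label}: rtt={rtt_ms}ms total={total:.3f}s latency share={share:.1%}")
```

Under these assumptions, moving the server from 100 ms away to 20 ms away cuts the image fetch from roughly 0.23 seconds to roughly 0.07, while the video takes about 800 seconds either way.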
CDN providers who originally built hugely distributed systems with little storage cannot make use of many of those previously deployed servers, as they cannot store large libraries of video, in some cases not even more than a handful of videos. But you wouldn’t want to in any case, as you don’t want to replicate the videos unnecessarily. A more centralized architecture with very large storage (only replicating for actual demand) is much more efficient. The number of locations in which you place servers is then mostly economically driven. It is a trade-off between storage and bandwidth consumption, and the balance depends on how many objects you are distributing from the library and how popularity is distributed across that library. While most CDN providers talk about how "unique" their networks are, nearly every CDN has almost the same architecture for distributing large objects, whether cached or streamed.
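The storage side of that trade-off can be sketched with a toy popularity model. The code below assumes requests follow a Zipf-like curve, where the title at rank r is requested in proportion to 1/r^s. That curve, the exponent and the 1,000-title library are all assumptions for illustration, since real libraries vary, but it shows why storing only the most popular titles at a location can still cover a large share of requests.

```python
def zipf_hit_ratio(library_size, titles_stored, exponent=1.0):
    """Share of requests served locally if only the top-ranked titles are stored.

    Assumes a Zipf-like popularity curve: weight of rank r is 1 / r**exponent.
    """
    if not 0 <= titles_stored <= library_size:
        raise ValueError("titles_stored must be between 0 and library_size")
    weights = [1.0 / (rank ** exponent) for rank in range(1, library_size + 1)]
    return sum(weights[:titles_stored]) / sum(weights)


if __name__ == "__main__":
    library = 1000  # hypothetical library size
    for stored in (10, 100, 500, 1000):
        ratio = zipf_hit_ratio(library, stored)
        print(f"store top {stored:4d} of {library}: {ratio:.1%} of requests served locally")
```

With these assumed numbers, storing the top 100 of 1,000 titles covers roughly 70 percent of requests. The remainder has to be pulled from a larger, more central store, which is where the bandwidth side of the trade-off comes in.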
Another reason almost none of the traffic comes from within an ISP network is DNS resolution. The ability of a CDN to localize traffic is somewhat limited by the resolution of the ISP’s DNS. Some ISPs will not enable resolution finer than the ISP as a whole, so the CDN can tell which ISP a request came from but not where within that ISP the user sits. So the whole issue of placing CDN servers within ISP networks that cover large geographies becomes pointless.
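A toy illustration of that limitation: in the sketch below, the CDN's DNS picks a node based only on the address of the resolver that asked, which is the assumption this example rests on. The node names, the lookup table and the addresses (taken from reserved documentation ranges) are all hypothetical.

```python
# Hypothetical table a CDN's DNS might hold: resolver address -> chosen node.
# Addresses come from reserved documentation ranges, not real networks.
RESOLVER_TO_NODE = {
    "192.0.2.53": "node-london",
    "198.51.100.53": "node-frankfurt",
}
DEFAULT_NODE = "node-default"


def pick_node(resolver_ip):
    """In this sketch the CDN sees only the resolver's address, not the user's."""
    return RESOLVER_TO_NODE.get(resolver_ip, DEFAULT_NODE)


# Three users of one hypothetical ISP in different cities, sharing one resolver.
USERS = [
    ("London", "192.0.2.53"),
    ("Manchester", "192.0.2.53"),
    ("Glasgow", "192.0.2.53"),
]


def nodes_for(users):
    """Map each user's city to the node the CDN would hand back."""
    return {city: pick_node(resolver_ip) for city, resolver_ip in users}


if __name__ == "__main__":
    for city, node in nodes_for(USERS).items():
        print(f"{city}: served from {node}")
```

Every user gets the same answer regardless of city, so extra servers placed deeper inside that ISP's network would never be selected on the basis of location.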
In addition, CDN load balancing plays a big role. CDN providers determine where individual objects are served from based on many factors, and the sophistication of the particular CDN’s algorithms determines how many factors are taken into account. In most cases this is a real-time, dynamic system that weighs factors like the performance of connected networks, the performance of the CDN itself (load balancing due to demand) and the costs to the CDN provider. This is fully under the control of the CDN provider and has nothing to do with the ISP. Even if an ISP houses a CDN server, there is absolutely no guarantee it will actually be used. And as mentioned above, in relation to large objects and cost, it is most unlikely to be used.
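As a minimal sketch of that kind of decision, the code below scores each candidate server on network latency, current load and delivery cost, then picks the lowest score. The three factors come from the paragraph above, but the weights, the figures and the server names are hypothetical. Real CDN algorithms are proprietary and will certainly differ.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    network_latency_ms: float  # performance of the connected network
    load: float                # current utilization, 0.0 (idle) to 1.0 (full)
    cost_per_gb: float         # delivery cost to the CDN provider


def score(candidate, w_latency=1.0, w_load=100.0, w_cost=500.0):
    """Lower is better. The weights are made-up values for illustration."""
    return (w_latency * candidate.network_latency_ms
            + w_load * candidate.load
            + w_cost * candidate.cost_per_gb)


def choose(candidates):
    """Pick the candidate with the lowest combined score."""
    return min(candidates, key=score)


# Hypothetical candidates: a small box inside the ISP and a larger regional site.
IN_ISP = Candidate("in-isp-cache", network_latency_ms=5, load=0.95, cost_per_gb=0.02)
REGIONAL = Candidate("regional-pop", network_latency_ms=25, load=0.30, cost_per_gb=0.01)

if __name__ == "__main__":
    for c in (IN_ISP, REGIONAL):
        print(f"{c.name}: score={score(c):.1f}")
    print("chosen:", choose([IN_ISP, REGIONAL]).name)
```

With these made-up numbers the server inside the ISP is closest but nearly full and more expensive to deliver from, so the regional site wins. Different weights would give a different answer, and the ISP has no say in them.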
One final point is that a CDN server placed inside an ISP network needs to be “filled”. The cache fill is data from the CDN’s origin (or the CDN’s customer’s origin). In 99% of cases this fill will come from outside the ISP’s network. The cache hit ratio should then become a very important factor for the ISP. But how many think about that? The cache fill data, plus the cost to house and power the CDN’s server, are borne by the ISP in many cases. However, large-object traffic, namely video, is what is causing cost increases for all ISPs. So if video is not being served from the CDN servers within the ISP network, is there a real benefit to having them there?
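The arithmetic behind the hit ratio question is simple enough to sketch. The code below assumes the hit ratio is measured by bytes and that every miss is fetched once from outside the ISP. The function names and the 1,000 GB figure are hypothetical.

```python
def isp_inbound_gb(delivered_gb, hit_ratio):
    """Traffic the ISP still pulls in from outside to fill the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return delivered_gb * (1.0 - hit_ratio)


def isp_saved_gb(delivered_gb, hit_ratio):
    """Traffic that never has to cross into the ISP's network."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return delivered_gb * hit_ratio


if __name__ == "__main__":
    delivered = 1000  # hypothetical GB delivered to subscribers by the in-ISP server
    for ratio in (0.9, 0.5, 0.1):
        inbound = isp_inbound_gb(delivered, ratio)
        saved = isp_saved_gb(delivered, ratio)
        print(f"hit ratio {ratio:.0%}: {inbound:.0f} GB still inbound, {saved:.0f} GB saved")
```

At a 90 percent hit ratio the ISP avoids most of the inbound traffic. At 10 percent it pays to house and power a server that saves it very little.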
The bottom line is that, as with many other topics in the content delivery market, people assume they know what terms mean, how things work or, more importantly, what impact it is all having on themselves, ISPs or other content owners. Content delivery networks as a whole need to do a much better job explaining exactly how they deliver video. Too many are overly concerned with not giving out technical details, but those details are exactly what we need from the industry so we can educate customers and start to debate in an open manner how one network can operate more effectively than another. Over ten years later, there are still too many confusing questions about the content delivery business and how all of this works.
While I have a good understanding of the technology, I don’t pretend to be a network engineer who builds networks for a living. I’d like to see a really good discussion take place in the comments section, with feedback from the CDNs directly as well as from those who work at ISPs. This is a topic many want to know more about, and one where additional input would benefit many.