Reviewing Fastly’s New Approach To Load Balancing In The Cloud

Load balancing in the cloud is nothing new. Akamai, Neustar, Dyn, and AWS have been offering DNS-based cloud load balancing for a long time. This cloud-based approach has many benefits over more traditional appliance-based solutions, but there are still a number of shortcomings. Fastly’s recently released load balancer product takes an interesting new approach, which the company says addresses many of the original challenges.

But before delving into the merits of cloud-based load balancing, let’s take a quick look at the more traditional approach. The global and local load balancing market has long been dominated by appliance vendors like F5, Citrix, A10, and others. While addressing key technology requirements, appliances have several weaknesses. First, they are costly to maintain. You need specialized IT staff to configure and manage them, not to mention the extra space and power they take up in your datacenter. Then there are the high support costs, sometimes as much as 20-25% of the total yearly cost of the hardware. Appliance-based load balancers are also notoriously difficult to scale. You can’t just “turn on another appliance” based on flash traffic or a sudden surge in the popularity of your website. Finally, these solutions don’t fit into the growing cloud model which requires you to be able to load balance within and between Amazon’s AWS, Google Cloud or Microsoft Azure.

Cloud load balancers address the shortcomings of appliance-based solutions. However, the fact that they are built on top of DNS creates some new challenges. Let’s consider an example of a user who wants to connect to a website. A DNS query is generated, and the DNS-based load balancing solution decides which region or location to send the user to based on a few variables. The browser or end-user device typically caches that answer for a minute or more. There are two key problems with this approach: DNS time-to-live (TTL) caching, and the small number of variables the load balancer can use to make the optimal decision.

The first major flaw with this approach is that DNS-based load balancing depends on a mechanism that was designed to help with the performance of DNS itself. The answer returned for a DNS query can be cached for a time period specified by the server. This is called the Time to Live or TTL, and the lowest value most sites use is usually between 30 and 60 seconds. However, most browsers have implemented their own caching layer that can override the TTL specified by the server. In fact, some browsers cache for 5-10 minutes, which is an eternity when a region or data center fails and you need to route end users to a different location. Granted, modern browsers have improved how well they respect the TTL, but there are a ton of older browsers and various libraries that still hold on to cached DNS responses for 10+ minutes.
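To make the failover problem concrete, here is a minimal sketch of the worst case. It assumes a client keeps using a cached answer for whichever is longer, the server’s TTL or its own internal cache time. The client names and cache times are hypothetical, picked from the ranges mentioned above rather than measured from any real browser.

```python
def worst_case_stale_seconds(server_ttl: int, client_cache_seconds: int) -> int:
    """Longest time a client may keep using an old answer after the DNS record changes.

    Assumption: the client honours whichever is longer, the TTL sent by the
    server or its own internal cache time.
    """
    return max(server_ttl, client_cache_seconds)


SERVER_TTL = 30  # seconds, the low end of the 30-60 second range

# Hypothetical clients; the cache times are illustrative only.
clients = {
    "client that respects the TTL": 0,
    "browser with a 5-minute cache": 5 * 60,
    "old library with a 10-minute cache": 10 * 60,
}

for name, cache_seconds in clients.items():
    stale = worst_case_stale_seconds(SERVER_TTL, cache_seconds)
    print(f"{name}: up to {stale} seconds pointed at the failed region")
```

The point of the sketch is that lowering the TTL on the server only helps the first kind of client. For the others, the failover time is set by software the site owner does not control.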

The second major flaw with DNS-based load balancing solutions is that the load balancing provider can only make a decision based on the IP of the querying recursive DNS server, or less frequently (if the provider supports it), the end-user IP. Most frequently, a DNS-based solution receives a DNS query for a hostname, and the load balancer looks at the IP address of the querying system, which is generally just the end user’s DNS resolver and is often not even in the same geography. The DNS-based load balancer has to make decisions based solely on this input. It doesn’t know anything about the specific request itself, e.g. the path requested, the type of content, whether the user is logged in or not, the particular cookie or header values, etc. It only sees the querying IP address and the hostname, which severely limits its ability to make the best possible decision.

Fastly says its new application-aware load balancer is built in such a way that it avoids these problems. It’s basically a SaaS service built on top of their 10+ Tbps platform, which already provides CDN, DDoS protection, and web application firewall (WAF). Fastly’s load balancer makes all of its load balancing decisions at the HTTP/HTTPS layer, so it can make application-specific decisions on every request, overcoming the two major flaws of the DNS-based solutions. Fastly also provides granular control, including the ability to make different load balancing decisions and ratios based on cookie values, headers, whether a user is logged in (and if they are a premium customer), what country they come from, etc. Decisions are also made on every request to the customer’s site or API, not just when the DNS cache expires. This allows for sub-second failover to a backup site if the primary is unavailable.
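A rough sketch of what a per-request, application-aware decision might look like is below. This is not Fastly’s configuration language or API; it is a hypothetical Python illustration, and the pool names, the `tier` cookie, and the country list are all assumptions made up for the example. It only shows the general idea: the decision can use the cookie and country of each request, and an unhealthy pool is bypassed on the very next request rather than after a DNS cache expires.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Request:
    """The parts of an HTTP request the example decision looks at."""
    path: str
    country: str = "US"
    cookies: dict = field(default_factory=dict)
    headers: dict = field(default_factory=dict)


# Hypothetical backend pools; names are illustrative only.
POOLS = {
    "premium": ["premium-1", "premium-2"],
    "eu": ["eu-1", "eu-2"],
    "default": ["us-1", "us-2"],
}
EU_COUNTRIES = {"DE", "FR", "IE", "NL"}


def choose_pool(req: Request) -> str:
    """Pick a pool from request details a DNS-based balancer never sees."""
    if req.cookies.get("tier") == "premium":
        return "premium"
    if req.country in EU_COUNTRIES:
        return "eu"
    return "default"


def pick_backend(req: Request, healthy: set, rng=random) -> str:
    """Choose a healthy backend for this request, failing over if needed."""
    candidates = [b for b in POOLS[choose_pool(req)] if b in healthy]
    if not candidates:
        # Preferred pool is down: fall back to the default pool immediately.
        candidates = [b for b in POOLS["default"] if b in healthy]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    return rng.choice(candidates)


# A premium user is routed by cookie, while an EU visitor fails over
# to the default pool because no EU backend is marked healthy.
healthy_now = {"premium-2", "us-1"}
print(pick_backend(Request("/account", cookies={"tier": "premium"}), healthy_now))
print(pick_backend(Request("/", country="DE"), healthy_now))
```

Because the function runs on every request, marking a backend unhealthy changes routing right away, which is the property the DNS-based approach cannot offer.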

The other main difference is that Fastly’s Load Balancer, like the rest of their services, is developed on their single edge cloud platform, allowing customers to take advantage of all the benefits of this platform. For example, they can create a proactive feedback loop with real-time streaming logs to identify issues faster and instant configuration changes to address these issues. You can see more about what Fastly is doing with load balancing by checking out their recent video presentation from the CDN Summit last month.

When It Comes To Cache Hit Ratio And CDNs, The Devil Is In The Details

The term “cache hit ratio” is used so widely in the industry that it’s hard to tell what exactly it means anymore from a measurement standpoint, or the methodology behind how it’s measured. When Sandpiper Networks first invented the concept of a CDN (in 1996), and Akamai took it to the next level by distributing the caching proxy “Squid” on a network of global servers, the focus of that caching at the time was largely images. But now we need to ask ourselves if focusing on overall cache hit ratio as a success metric is the best way to measure performance on a CDN.

In the late 90’s, much of the Internet’s web applications were being served from enterprises with on premise data centers and generally over much lower bandwidth pipes. One of the core issues Akamai solved was relieving bandwidth constraints at localized enterprise data centers. Caching images was critical to moving bandwidth off the local networks and bringing content closer to the end user.

But fast forward 20 years and the Internet of today is very different. Pipes are bigger, applications are more complicated and users are more demanding with respect to performance, availability and security of those applications. So, in this new Internet, is the total cache hit ratio for an application a good enough metric to consider, or is there a devil in the details? Many CDNs boast of their customers achieving cache hit ratios around 90%, but what does that really mean and is it really an indicator of good performance?

To get into cache hit ratios we must think about the elements that make up a webpage. Every webpage delivered to a browser is made up of an HTML document and then other assets including images, CSS files, JS files and Ajax calls. HTTP Archive tells us that, on average, a web page contains about 104-108 objects coming from 19 different domains. The average breakdown of asset types served per webpage from all HTTP Archive sites tested looks like this:

Most of the assets being delivered per web page are static. On average 9 may specifically be content type HTML (and therefore potentially dynamic) but usually, only one will be the initial HTML document. An overall cache hit rate for all of these objects tells us what percentage of them are being served from the CDN, but does not give developers the details they need to truly optimize caching. A modern web application should have most of the images, CSS files and other static objects served from cache. Does a 90% cache hit ratio on the above page tell you enough about the performance and scalability of the application serving that page?  Not at all.

The performance and scalability of a modern web application is often largely dependent on its ability to process and serve the HTML document. The production of the HTML document is very often the largest consumer of compute resources on a web application. When more HTML documents are served from cache, fewer compute resources are consumed and therefore applications become more scalable.

HTML delivery time is also critical to page load time and start render time, since the HTML is the first object delivered to the browser and blocks all other resources from being delivered. Generally, serving HTML from cache can cut HTML delivery time to circa 100ms and significantly improve the user experience and users’ perception of page speed. Customers should seek to understand the cache hit ratio by asset type so developers can specifically target improvements in cache hit rates for each type. This would result in faster page load times and a more scalable application.

For example, seeking closer to 100% cache hit rates for CSS files, JS files and possibly images would seem appropriate.

As would understanding what cache hit rate is being achieved on the HTML.

[*Snapshots from the portal]
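As a rough illustration of why the breakdown matters, the sketch below computes an overall cache hit ratio and a per-asset-type ratio from the same set of requests. The counts are made up for the example (100 objects, 9 of them HTML, loosely following the averages quoted earlier) and are not taken from any CDN’s logs; the record layout is also an assumption.

```python
from collections import defaultdict

# Hypothetical request log: (asset_type, served_from_cache).
# Counts are illustrative only: 100 objects, 9 of them HTML.
LOG = (
    [("image", True)] * 55 + [("image", False)] * 1
    + [("css", True)] * 7
    + [("js", True)] * 20 + [("js", False)] * 1
    + [("html", True)] * 1 + [("html", False)] * 8
    + [("other", True)] * 7
)


def overall_hit_ratio(records):
    """Fraction of all requests served from cache."""
    return sum(1 for _, hit in records if hit) / len(records)


def hit_ratio_by_type(records):
    """Fraction of requests served from cache, per asset type."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for asset_type, hit in records:
        totals[asset_type] += 1
        if hit:
            hits[asset_type] += 1
    return {t: hits[t] / totals[t] for t in totals}


print(f"overall: {overall_hit_ratio(LOG):.0%}")
for asset_type, ratio in sorted(hit_ratio_by_type(LOG).items()):
    print(f"{asset_type:>6}: {ratio:.0%}")
```

With these made-up counts the overall ratio comes out at 90%, while the HTML ratio is only about 11% (1 of 9). The headline number looks healthy even though almost every HTML document is still being generated at the origin.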

While not all HTML can be served from cache, the configurability of cache solutions like Varnish Cache (commercially available through Varnish Software and Fastly) and improved HTML management options such as HTML streaming (commercially available from Instart Logic) have made it possible to cache HTML. In addition, new developer tools such as the Developer PoP allow developers to more safely configure and deploy HTML caching without risking incidents in production.

Many CDNs focus on overall cache hit rate because they do not encourage their users to cache HTML. A 90% cache hit rate may sound high, but when you consider that the 10% of elements not cached are the most compute-heavy, a different picture emerges. By exposing the cache hit ratio by asset type, developers are able to see the full picture of their caching and optimize accordingly. Builders and managers of web applications can then more effectively understand and improve the performance, scalability, and user experience of their applications, and this is where the industry needs to head.

New Report Reveals Low TV Network Brand Recognition among Young Millennials, Here’s What it Means for Business

A new report from ANATOMY entitled “The Young and the Brandless” ranks seven key TV and OTT networks according to their digital performance and brand recognition among young millennials (18-26). The report reveals what raises a TV network brand’s relevance in digital environments.

The biggest difference between TV brands with high and low brand recognition is a user experience-first strategy, ANATOMY’s report suggests. But, according to ANATOMY CEO Gabriella Mirabelli, “While networks consistently indicate that the viewer is at the center of their thinking, they don’t seem to actually analyze how users truly behave.” User experience is the key to making a TV network relevant in digital spaces. TV brands with higher brand recognition among young millennials (e.g., Netflix) are extremely social media savvy. ANATOMY found that these brands know when to post and what to post on to drive higher rates of engagement. For example, according to ANATOMY, Facebook posts published between 12-3 PM generate “236% more engagements” (reactions, shares, and comments).

A TV network’s website or app is also an important touchpoint for its brand in digital spaces, but people judge websites quickly, in about “3.42 seconds”, according to ANATOMY. TV networks with higher brand recognition had easy-to-use user interfaces on their websites and apps. They made it easy for people to watch shows, discover new content, and find information about shows.

There is a lot of other great data in the report, which you can download for free here.

Media and Web Performance CDN Pricing Survey: Raw Data Now Available

In April I completed my yearly pricing survey asking customers of third-party content delivery networks what they pay, which vendor(s) they use and how much their traffic is growing, amongst a host of other questions. New this year, I also collected data on web performance pricing. If you are interested in purchasing all of the raw data, minus the customers’ names, please contact me (917-523-4562). The media CDN pricing raw data is from over 600 customers and the web performance pricing data is from over 50 customers. I can also collect custom data around third-party CDN services.

Conviva Raises 6th Round Of Funding Totaling $40M, Has Raised $112M To Date

This morning Conviva announced their sixth round of funding in the amount of $40M, with money coming from a new investor, Future Fund, along with several existing investors. To date Conviva has raised a total of $112M. Conviva has been the longest operating vendor in the market offering content owners the ability to measure the QoE of their OTT offerings. The company says it has 200 video publishing brands and service providers as customers, including the likes of HBO, SKY, and Turner.

While the company won’t disclose any revenue numbers, the number I keep hearing whispered in the industry is that Conviva did around $70M in revenue in 2016. I have no way to verify that, but a former Conviva employee told me they wanted to do $100M+ in revenue by 2017, which to me, seems aggressive.

Conviva published their latest viewing minutes numbers across all customers citing 80% growth last year to 1 billion minutes / day and expected growth of 150% for 2018. Their customer billing model is based on viewer hours so you can extrapolate that their revenue would grow with increased viewing time. The company claims to be growing faster than the overall OTT market that is estimated to be somewhere at around 20-30% CAGR.

Conviva told me that 3-5% of the total spend on traditional TV goes to measurement and analytics and they believe the total available market will be the same for OTT. Their rationale is that continuous, census-based measurement and analytics over the internet and across a wide variety of consumer video devices is inherently more valuable than more traditional panel-based statistical approaches. This market has historically been relatively small, but is now getting much more competitive, so the success of these companies will depend on the market truly experiencing accelerated growth. The recent historic drop in traditional Pay-TV subscribers is a good leading indicator of that being the case.

Conviva said they raised this round to do strategic development on their AI for video platform and the sensor network they deploy across all of their publishers’ customer viewing devices. Collection and basic measurement will face downward price pressure from competition and market maturity, so they feel this product vision will be key to their success. In speaking with the company I learned that they have very sophisticated AI or machine learning models that have been trained for years on video viewing data from their large customer base. They stressed the large and diverse data set derived from continuous real-time measurement of all metrics and metadata associated with every second of all video viewing sessions.

This is not just QoE data, but also engagement data, audience data, content metadata, infrastructure metadata, and more. The combination of this continuous and comprehensive data collection and AI purpose-built for video could be a very interesting formula to unleash enormous value for OTT businesses.