Updated. Amazon Web Services (AWS), the trailblazing provider of Infrastructure as a Service (IaaS), has changed the conversation about computing infrastructure. Today, instead of simply assuming that you’ll buy and operate your own servers, storage and networking, you can always consider AWS, and for many new businesses it’s simply the default choice.
I’m a huge fan of cloud computing in general and AWS in particular. But I’ve long had an instinct that the economics of the choice between self-hosting and a cloud provider have more texture than the attractive-sounding “10 cents an hour” suggests, particularly as a function of demand distribution. As a case in point, Zynga has made it known that, for economic reasons, it now uses its own infrastructure for baseline loads and uses Amazon for peaks and for the variable loads surrounding new game introductions.
An analysis of the load profiles
To tease out a more nuanced view of the economics, I’ve built a detailed Excel model that analyzes the relative costs and sensitivities of AWS versus self-hosted in the context of different load profiles. By “load profiles,” I mean the distribution of demand over the day/month as well as relative needs for bandwidth versus compute resources. The load profile is the key factor influencing the economic choice because it determines what resources are required and how heavily these resources are utilized.
The model provides a simple way to analyze various load profiles and allows one to skew the load between bandwidth-heavy, compute-heavy or any combination. In addition, the model presents the cost of operating 100 percent on AWS, 100 percent self-hosted as well as all hybrid mixes in between.
In a subsequent post, I will share the model and describe how you can use it for scenarios of interest to you. But for this post, I will outline some of the conclusions that I’ve derived from looking at many different scenarios. In most cases, the analysis illustrates why intuition is right (for example, that a highly variable compute load is a slam dunk for AWS). In other cases, certain high-sensitivity factors become evident and drive the economic answer. There are also cases where a hybrid infrastructure is at least worthy of consideration.
To frame an example analysis, here is the daily distribution of a typical Internet application. In the model, traffic distribution is an input from which bandwidth requirements are computed. The distribution over the day reflects the behavior of the user base (in this case, one with a high U.S. business-hour activity peak). Computing load is assumed to follow traffic according to a linear relationship, i.e. higher traffic implies higher compute load.
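The linear relationship between traffic and compute load can be sketched in a few lines. This is only an illustration of the idea, not the Excel model itself: the function name `hourly_load`, the daily transfer volume, and the peak instance count are all hypothetical inputs chosen for the example.

```python
import math

def hourly_load(traffic_shares, daily_transfer_gb, peak_instances):
    """Turn a traffic distribution into per-hour bandwidth and instance counts.

    traffic_shares    -- relative traffic weight per hour (any positive scale)
    daily_transfer_gb -- total data transferred per day (hypothetical input)
    peak_instances    -- instances needed in the busiest hour (hypothetical input)
    """
    total = sum(traffic_shares)
    peak = max(traffic_shares)
    rows = []
    for share in traffic_shares:
        gb_this_hour = daily_transfer_gb * share / total
        # GB -> megabits (x 8 x 1000), averaged over the 3600 seconds of the hour
        mbps = gb_this_hour * 8 * 1000 / 3600
        # Compute load is assumed to scale linearly with traffic
        instances = math.ceil(peak_instances * share / peak)
        rows.append((mbps, instances))
    return rows

# A toy four-period "day": quiet, ramp-up, business-hour peak, quiet
for mbps, instances in hourly_load([1, 2, 4, 1], 3600, 8):
    print(f"{mbps:8.1f} Mbps  {instances} instances")
```

A real profile would use 24 hourly weights; the shape of that list is what the rest of the analysis is sensitive to.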
Note that while labor costs are included in the model, I am leaving them out of this example for simplicity. Because labor is a mostly fixed cost for each alternative, it will tend not to impact the relative comparison of the two alternatives. Rather, it will impact where the actual break-even point lies. If you use the model to examine your own situation, then of course I would recommend including the labor costs on each side.
For this example, to compute costs for Amazon, I have assumed Standard Extra Large instances and ELB load balancer for the Northern California region. The model computes the number of instances required for each hour of the day. Whenever the economics dictate it, the model applies as many AWS Reserved Instances (capacity contracts with lower variable costs) as justified and fills in with on-demand instances as required. Charges for data are computed according to the progressive pricing schedule that Amazon publishes. To compute costs for self-hosting, I assume co-location with the peak number of Std-XL-equivalent servers required, each loaded to no more than 80 percent of capacity. The costs of hardware are amortized over 36 months. Power is assumed to be included with rackspace fees. Bandwidth is assumed to be obtained on a 95th percentile price basis.
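As a rough sketch of the two costing approaches just described, the snippet below picks the number of reserved instances that minimizes a day's compute cost and sizes the self-hosted fleet for peak load at 80 percent utilization with 36-month amortization. The function names and all prices are hypothetical placeholders, not Amazon's published rates, and the reserved-instance fee is assumed to be expressible as a fixed cost per day plus a lower hourly rate charged only when the instance runs.

```python
import math

def cheapest_reserved_count(hourly_instances, on_demand_rate,
                            reserved_rate, reserved_fixed_per_day):
    """Try every reserved-instance count and return (count, daily_cost)."""
    best_count, best_cost = 0, None
    for reserved in range(max(hourly_instances) + 1):
        # Fixed fee for holding the reservations, spread per day
        cost = reserved * reserved_fixed_per_day
        for needed in hourly_instances:
            # Reserved capacity is used first; on-demand fills the remainder
            cost += min(needed, reserved) * reserved_rate
            cost += max(needed - reserved, 0) * on_demand_rate
        if best_cost is None or cost < best_cost:
            best_count, best_cost = reserved, cost
    return best_count, best_cost

def self_hosted_servers(peak_instances, max_utilization=0.8):
    """Servers needed so that none is loaded beyond max_utilization at peak."""
    return math.ceil(peak_instances / max_utilization)

def monthly_hardware_cost(servers, unit_price, months=36):
    """Straight-line amortization of the hardware purchase."""
    return servers * unit_price / months

# Hypothetical day: 2 instances for 20 hours, 10 instances for a 4-hour peak
demand = [2] * 20 + [10] * 4
reserved, daily_cost = cheapest_reserved_count(demand, 1.00, 0.50, 6.00)
servers = self_hosted_servers(max(demand))
print(reserved, daily_cost)                        # 2 68.0
print(servers, monthly_hardware_cost(servers, 3600))  # 13 1300.0
```

With these made-up prices, only the two always-on instances are worth reserving, while the self-hosted side must pay for 13 servers around the clock to cover a four-hour peak. That asymmetry is why the load profile matters so much.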
Now let’s look at a sensitivity analysis. Notice in the above example that a bit more than half of the total cost for each alternative is for bandwidth/data transfer charges ($35,144 for self-hosted at $8/Mbps and $36,900 for AWS). This is important because while Amazon pricing is fixed and published, 95th percentile pricing is highly variable and competitive.
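For readers unfamiliar with 95th percentile billing, here is a minimal sketch of one common convention: bandwidth is sampled at regular intervals over the month, the top 5 percent of samples are discarded, and the highest remaining sample is the billed rate. Providers differ in sampling details, so the function names and the exact rounding rule below are assumptions.

```python
def billable_mbps(samples):
    """Return the 95th percentile of usage samples (in Mbps).

    Assumes the convention of discarding the top 5% of samples
    (rounded down) and billing the highest remaining one.
    """
    ordered = sorted(samples)
    discard = len(ordered) // 20          # top 5% of the samples
    return ordered[len(ordered) - discard - 1]

def monthly_bandwidth_cost(samples, price_per_mbps):
    """Billed rate times the negotiated price per Mbps per month."""
    return billable_mbps(samples) * price_per_mbps

# 100 toy samples of 1..100 Mbps: the five highest are ignored
samples = list(range(100, 0, -1))
print(billable_mbps(samples))              # 95
print(monthly_bandwidth_cost(samples, 8))  # 760
```

Because the cost is the billed rate times a negotiable price, the figures quoted above imply a billed rate of roughly 35,144 / 8 ≈ 4,393 Mbps, and every dollar negotiated off the per-Mbps price moves the self-hosted total by about that amount.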
The chart above shows total costs as a function of co-location bandwidth pricing. AWS costs are independent of this and thus flat. What this chart shows is that self-hosting costs less at any bandwidth price under about $9.50 per Mbps per month. And if you can negotiate a price as low as $4, you’d save more than 40 percent by self-hosting. I’ll leave discussion of the hybrid to another post.
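The break-even logic behind this kind of chart is simple enough to sketch. Self-hosted cost is a fixed part (hardware, rack space) plus billed bandwidth times price, while the AWS total is flat. The dollar figures in the example are round, purely illustrative numbers, not the ones from my model, and the function names are hypothetical.

```python
def self_hosted_total(fixed_cost, billed_mbps, price_per_mbps):
    """Monthly self-hosted cost: non-bandwidth costs plus bandwidth."""
    return fixed_cost + billed_mbps * price_per_mbps

def break_even_price(aws_total, fixed_cost, billed_mbps):
    """Bandwidth price per Mbps at which self-hosting equals the AWS total."""
    return (aws_total - fixed_cost) / billed_mbps

def savings_fraction(aws_total, self_total):
    """Fraction of the AWS bill saved by self-hosting (negative if it costs more)."""
    return (aws_total - self_total) / aws_total

# Illustrative inputs only: $60,000 AWS total, $20,000 self-hosted
# non-bandwidth costs, 4,000 Mbps billed at the 95th percentile
aws, fixed, mbps = 60_000, 20_000, 4_000
print(break_even_price(aws, fixed, mbps))                               # 10.0
print(savings_fraction(aws, self_hosted_total(fixed, mbps, 4)))         # 0.4
```

Plugging in your own totals shows immediately how far a negotiated bandwidth price has to fall before the decision flips.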
This should provide a bit of a feel for how I’ve been conducting these analyses. Above is a visual summary of how different scenarios tend to shake out. It confirms the intuitive conclusion that the spikier the load, the better the economics of the AWS on-demand solution. Similarly, the flatter or less variable the load distribution, the more self-hosting appears to make sense. And if your situation uses a lot of bandwidth, you need to look more closely at the self-hosted savings that a lower negotiated bandwidth price could make feasible.
Update (Feb. 14): This post has garnered a lot of much appreciated attention. From the comments, I see that two clarifications would be helpful:
- The key point here is that a comparison of the cost of cloud hosting versus self-hosting needs to be based on the profile of your load. It is not that Amazon (or any other provider) is more expensive than self-hosting, as this is often not the case. Moreover, where exactly your break-even point lies matters less than knowing the main sensitivities (e.g. bandwidth cost, CPU load, storage, etc.) for your situation, so that you can understand which differences could flip the decision. The results here are for this example only; other examples will produce different results, some in favor of cloud and some in favor of self-hosting.
- The specific use case I’ve chosen is for a business that’s pretty far along. But some people have been wondering how this example applies to startups. That’s a great question.
While I’ve referred to “spiky” loads, there are other ways to describe that kind of load: “variable,” “unknown” or “unpredictable,” which is the situation that a startup (or other new business endeavor) usually finds itself in. In those cases, the fact that you cannot forecast very well makes it highly unlikely you’ll save money by self-hosting, because you’re very unlikely to buy the right amount of capacity. You’ll either overprovision and waste money on unused capacity, or you’ll buy too little and compromise the business. So while you might not call your startup load “spiky,” the fact that it’s unpredictable gives it a similar profile in the model, and hence the economic conclusion would tell you to go the cloud infrastructure route.
Another not-strictly-economic consideration for startups (and others) is the benefit of focusing one’s attention on primary value-creating activities rather than on activities that are a commodity relative to the business and that one might not be very good at anyway. In addition, AWS and other cloud providers give us the highly valuable ability to experiment with little downside. This is especially important given the highly iterative, trial-and-error nature of building successful Internet businesses.
The point of this particular example is that if you have a significant amount of load that is well known and predictable, then you may be able to save some money by bringing a portion or all of that load in-house.
Charlie Oppenheimer is a serial-CEO and currently an executive-in-residence at venture-capital firm Matrix Partners. His most recent company, Digital Fountain, was acquired by Qualcomm, and his previous company, Aptivia, was acquired by Yahoo. He blogs at stratamotion.com.