The great debate: Windows Azure vs. Amazon Web Services

The Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) markets are growing faster than ever. But sometimes it seems like startups base their infrastructure decisions on a popularity contest or random selection. When it comes to the two gorillas in the cloud space, some developers swear by Amazon and others think Azure is the best, but details are often sparse as to why one option is better than the other.

To find out how these two services measure up in the real world, two startup tech guys, Craig Knighton of LiquidSpace and Zach Richardson of Ravel Data, lay out the cases for their clouds of choice, based on how the services have performed at living, breathing companies.

Knighton is in Windows Azure’s corner, while Richardson takes up the case for AWS.

Why is the choice between AWS and Microsoft? Can no other providers fit the bill?

Knighton: The real choice is between IaaS and PaaS. If we were interested in an IaaS cloud, we would have definitely considered AWS.

However, the real win for us was getting access to an environment that was as managed as possible, so we are spending our time and money focused on solving the business problem instead of creating and managing our various environments.

We did give some consideration to Google App Engine as a possible PaaS alternative but since time to market was important and we had .NET skills at the time, it didn’t make sense to gamble on Google App Engine technology and learning curve.

Richardson: I think that there are many other providers that can fit the bill, but most of them can’t provide the price points or size that Amazon can provide.

Building up the operational capability to provide a service like AWS or Azure is a difficult proposition. Both AWS and Azure provide multiple locations and pay-by-the-hour capability. That’s actually really hard to do without massive capital behind it.

Isn’t Azure all about .NET? Does language support matter?

Knighton: Assuming that we are talking about a modern managed-code language like Java, C# or Ruby, then I agree that the language does not matter much for building a typical business application.

The language is only one part of the combination of runtime and supporting framework libraries and it is the working knowledge of that entire combination that determines the effectiveness of the team. If we were trying to solve a big compute problem where managed code was not going to work because we needed to write compiled code, then language would matter, but that also increases the average skillset you need in the team to tackle such problems.

Richardson: I absolutely think language support matters.

Getting stuck in a single framework like .NET where there is only one “provider” for .NET tools can be a huge hindrance in any future decisions you make as a company.

Microsoft (and Azure by default) seems to be all about lock-in: lock-in on the operating system, lock-in on the language platform, and lock-in on the Azure services. Also, many companies do have to solve big compute problems that Java, unlike .NET, is well positioned for. While many larger companies don’t have to be as concerned with lock-in, it is a very scary thought for most start-ups that need a clearer longer-term cost structure.

IaaS (AWS) vs. PaaS (Azure): why is one approach better than the other?

Knighton: Speaking in defense of PaaS, we really enjoy the reduced labor costs and faster response times associated with having a direct connection between development and our production environment, not to mention no mysterious “production operations” team responsible for the health of the OS, ping, power and pipe.

We have also found great goodness in the prescriptive limitation that you must build an app that is “sessionless” and can be deployed to any number of nodes as needed to scale up the app. We all know that we should do it the right way, but having the platform drive you to do it makes my job as architect that much easier.
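To make the “sessionless” constraint concrete, here is a minimal Python sketch. It is an illustration only, not LiquidSpace’s code and not an Azure API: the names `SessionStore`, `Node` and `handle_request` are hypothetical, and the in-memory dictionary stands in for whatever shared cache or table service a real deployment would use. The point is that no node keeps user state locally, so any node can serve any request.

```python
# Hypothetical sketch: all names here are illustrative, not part of any Azure API.

class SessionStore:
    """Stands in for a shared external store (a cache or table service)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, state):
        self._data[session_id] = dict(state)


def handle_request(store, session_id, item):
    """Add an item to a user's cart; all state lives in the shared store."""
    state = store.get(session_id)
    cart = list(state.get("cart", []))
    cart.append(item)
    state["cart"] = cart
    store.put(session_id, state)
    return cart


class Node:
    """A web node holds only a reference to the shared store, no user state."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def serve(self, session_id, item):
        return handle_request(self.store, session_id, item)


shared = SessionStore()
nodes = [Node("node-a", shared), Node("node-b", shared)]

# Two consecutive requests from the same user land on different nodes.
nodes[0].serve("user-1", "desk")
cart = nodes[1].serve("user-1", "meeting room")
print(cart)
```

Because the second node sees the item added through the first, nodes can be added or removed behind a load balancer without losing a user’s state.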

Richardson: This question seems to be a little misleading. AWS is special in that it provides both IaaS and PaaS.

AWS provides an entire suite of platform-based services that mirror the solutions provided by Azure. For example, Amazon’s Elastic MapReduce corresponds to Microsoft’s Dryad. Both have “scalable relational databases.” AWS provides elastic Tomcat deployment of webapps, as well as SimpleDB. They both provide load balancing as a platform, blob storage and queue services. The list really goes on and on.

However, AWS doesn’t have the .NET support that Microsoft has. I think the big advantage from AWS is you can get the platform capabilities and the freedom to easily deploy brand new technologies before they become part of the platform. So I guess people need to decide how important cutting-edge is. In the case of a “simple business application” I guess it really isn’t that important.

Did the AWS outage put the fear of god in you? Is one service better architected than the other to avoid service outages, security breaches, etc.?

Knighton: I think it’s possible to build a sufficiently redundant failover strategy on either cloud with comparable overall uptime statistics, but it’s just a hell of a lot easier to do it on Azure.

Bragging that your cloud was not the most recent one to fail is just tempting the gods to smite you with the next unexpected physical disaster or limitation of the services on which you depend. Lots of app catastrophes occur because of poor application of the services used to create the architecture and not because of any fragility in the cloud itself. I would call this one a draw.

Richardson: I personally don’t know how well Azure’s failover strategies work. Both platforms advertise identical uptime. It appears that there are more AWS users, and far more consumer-web AWS users, so any small outage will get even more coverage. There are plenty of companies that used multiple AWS availability zones and weren’t affected.

I agree with Craig on this. One hundred percent uptime of an availability zone or data-center, whether it is Azure or AWS, is just not going to happen. Agreed, it’s a draw, folks.
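The arithmetic behind spreading an application across several availability zones can be sketched in a few lines of Python. Two loud assumptions apply: the 99.95 percent per-zone figure is an illustrative number rather than a quoted SLA from either provider, and the formula treats zone failures as independent, which real outages do not always honor.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(zone_availability, zones):
    """Probability that at least one of `zones` zones is up.

    Assumes zone failures are independent (an idealization).
    """
    if zones < 1:
        raise ValueError("need at least one zone")
    return 1 - (1 - zone_availability) ** zones


def expected_downtime_hours(availability):
    """Expected hours per year during which nothing is available."""
    return (1 - availability) * HOURS_PER_YEAR


# Illustrative per-zone availability; not a figure from AWS or Azure.
PER_ZONE = 0.9995

for n in (1, 2, 3):
    a = combined_availability(PER_ZONE, n)
    print(f"{n} zone(s): {a:.7%} available, "
          f"about {expected_downtime_hours(a):.4f} hours down per year")
```

Under these assumptions each extra zone shrinks expected downtime sharply, yet the result never reaches 100 percent, which is the point both speakers make.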

Does either make it particularly easy for users to take proactive steps around high availability and security?

Knighton: As I said before, I think it is easier for the average engineer to accomplish high availability on Azure because the tools push you down that path and make it pretty darn easy to build stateless applications.

Richardson: I think there are more options for building an absolutely massive HA application on Amazon. And it is equally easy for the “average engineer.” If you are a .NET engineer, I’m sure Azure will be easier.

When it comes to security, I have to admit I’m not sure how easy it is in Azure, but I do know it isn’t hard in Amazon.

Azure can’t really compete with AWS on performance, right (Cluster Compute, GPUs, etc.)?

Knighton: Performance is in the eye of the beholder. Are you trying to build a large-scale compute farm that will spend the next three months rendering frames for the next Toy Story sequel, or are you running an above-average mobile/web business application?

Compute power is great until you touch a data store, and then the only thing that matters is how often you read and write. We’re plugged into the Azure AppFabric Caching feature and will be using it more extensively in upcoming releases of LiquidSpace where appropriate. But we are also finding that our use of read-optimized de-normalized views gives us read response times that rival hitting a cache server anyway, so our use of it may not be as extensive as you would think.
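The idea of a read-optimized de-normalized view can be shown with a small Python sketch. It is a generic illustration, not LiquidSpace’s implementation: the customer and order data, and the names `WriteStore`, `CustomerSummaryView`, `add_customer` and `place_order`, are all hypothetical. Writes go to a normalized store and also update a precomputed per-customer record, so a read is a single key lookup with no joins or aggregation at query time.

```python
class WriteStore:
    """Normalized write side: customers and orders are kept separately."""

    def __init__(self):
        self.customers = {}  # customer_id -> name
        self.orders = []     # (order_id, customer_id, amount)


class CustomerSummaryView:
    """Read side: one de-normalized record per customer, ready to serve."""

    def __init__(self):
        self.rows = {}

    def on_customer_added(self, customer_id, name):
        self.rows[customer_id] = {"name": name, "order_count": 0, "total": 0}

    def on_order_placed(self, customer_id, amount):
        row = self.rows[customer_id]
        row["order_count"] += 1
        row["total"] += amount


def add_customer(store, view, customer_id, name):
    """Command: record the customer, then update the read view."""
    store.customers[customer_id] = name
    view.on_customer_added(customer_id, name)


def place_order(store, view, order_id, customer_id, amount):
    """Command: record the order, then update the read view."""
    store.orders.append((order_id, customer_id, amount))
    view.on_order_placed(customer_id, amount)


store, view = WriteStore(), CustomerSummaryView()
add_customer(store, view, "c1", "Ada")
place_order(store, view, "o1", "c1", 40)
place_order(store, view, "o2", "c1", 60)

# The read path is a single key lookup against precomputed data.
summary = view.rows["c1"]
print(summary)
```

The trade-off is extra work on every write in exchange for cheap reads, which is why such a view can approach the response time of a cache.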

Richardson: For pure performance, I highly doubt Azure can match AWS. At worst, I bet AWS and Azure are equal but in specialized applications I expect AWS will heavily out-perform Azure.

In the read/write case, it is far simpler on AWS to deploy super fast high-transaction data-stores like Apache Cassandra, or to use Amazon SimpleDB, which is built on top of Amazon’s Dynamo data store. In other words, you don’t have to set up your own NoSQL implementation based on CQRS and Azure blobs or tables.

Whose big data tools are better?

Knighton: A fair question, and unless our friends in Redmond are going to surprise us soon with a native equivalent to MapReduce or a NoSQL equivalent such as SimpleDB, RavenDB, or MongoDB then I would have to say that AWS has an advantage today.

Since we built out our own NoSQL implementation based on CQRS and Azure blobs and tables, we are on our way there, but the lack of a fast, horizontally scalable lookup service that is native to the architecture is something they need to fix.

There is an outlet valve, though. It’s probably possible to stand up our own OS instances in Azure and to install and manage our own RavenDB or MongoDB implementation if push came to shove. The only issue is that in doing so, we would be crossing the line from PaaS to IaaS and weakening the value proposition of our cloud choice. Ultimately, the fewer moving parts we have to maintain, the better.

Richardson: Elastic MapReduce, SimpleDB, S3, easy Hadoop or Cassandra installs using Cloudera and DataStax scripts.

Amazon does the big data thing really well. Extra-Large Memory instances, High-CPU instances, Cluster Compute instances. They can’t be beat right now. Amazon definitely provides greater variability for different big data applications. Every big data problem requires different underlying capabilities — this is definitely a case where a one-size-fits-all solution might not be a solution at all.

Given their respective histories, one as a bookseller and one as purveyor of the blue screen of death with an antitrust history, why should anyone trust either provider?

Knighton: Forget the history, good or bad.

Here’s what matters — if Amazon fails at this, they can shrug it off and go back to selling books. If Microsoft fails, they die a long, slow death. Where would you put your money? If you are in this for the long haul (and we are because we intend to be wildly successful) then you have to go with the best strategic alignment.

Richardson: Amazon is much more than a bookseller; they are innovators. Besides really creating the market for cloud and then the market for cloud-by-the-hour, Amazon has created an enormous amount of innovation in recommendation engines and horizontally scalable data-stores.

Neither Microsoft nor Amazon is going away anytime soon. I would make a decision based on one thing and one thing only: bleeding-edge capability and lock-in. If you want to be able to adopt new capabilities fast and have the ability to move if necessary, AWS is the best bet.

Ultimately, what are the reasons for choosing AWS/Azure, and not choosing the other? Cost? Development tools? Ecosystem?

Knighton: Yes. All three. We know the development tools, we’ve gained confidence in the ecosystem and the reliability and scalability that comes with it, but most importantly, it is clear that the Azure play is a dramatic move for Microsoft away from their traditional server licensing models toward providing a platform dial tone on a monthly basis.

We had expected to go the open source software route before investigating Azure because my past experience was that it was just too expensive to rent or own the Microsoft OS to run on top of anyone’s IaaS offering even though it might have been our preference because it was easier to use and support. When we did a direct comparison of capabilities and equivalent costs we were quite pleased to find that Microsoft is pricing to match or beat the equivalent offering. With cost off the table, the decision was easy.

Richardson: I’m also going to say yes, all three. First off, people coming out of school don’t know .NET. They know Java, Python, Scala, or other non-Microsoft technologies. This is the future pool of talent. It is hard to find junior engineers already skillful enough in .NET to make an impact so labor costs are far cheaper in the open ecosystem.

Second, there is the issue of development tools like Eclipse, Emacs, IntelliJ, Google Web Toolkit, etc. I don’t know many awesome developers that like to use much else.

And thirdly, do a quick search for “Java user group” on Google: 554,000 results. Search for “.NET user group” and you get 319,000. And that is just Java. When you start looking at all of the other communities out there, open source is a much bigger movement. They also tend to be groups that are always pushing the edge of what is possible.

But I think it really comes down to these two key differentiators: Do you want the power of choice? And do you want the ability to use bleeding edge technologies?

Feature image courtesy of Flickr user traviscrawford. Hurricane image courtesy of Flickr user Bernt Rostad.