What Amazon and Its Customers Can Learn From Last Week’s Outage

Last week’s Amazon Web Services outage unleashed a torrent of speculation from technology pundits and the mainstream media, and opinion appears surprisingly divided as to where any blame should lie. Problems, which affected part of one of AWS’ five global data centers, began early on Thursday, and, thanks to a lack of detailed information on what was wrong or how it could be fixed, a small number of companies were still struggling days later as Amazon attempted to restore data from backups.

There doesn’t seem to be much room for doubt that Amazon is at least partly responsible. The failure should never have cascaded as far or as long as it did. Amazon describes its Availability Zones, into which it divides each of its Regions (data centers), as “distinct locations that are engineered to be insulated from failures in other Availability Zones.” Yet this outage initially affected at least two of these zones. Information dissemination was poor, and normally vocal champions within the company went silent. At the time of writing, the Amazon Web Services Blog still doesn’t even acknowledge that there was ever an issue.

But while Amazon may be responsible for the initial failure, and for a lack of communication while it was being resolved, it’s also clear that a number of its customers had a far harder time than they needed to because of how their services are designed and operated. As Derrick Harris noted earlier this week, Twilio, SmugMug and Netflix were among those companies to emerge almost unscathed, and this wasn’t due to luck. It was a philosophy of system design that understood the power — and the limitations — of using a commodity service like Amazon’s. Cloud computing consultant Ben Kepes notes that “highlight has been made of the need to think beyond one zone, one data center, one region and one provider to build a robust and resilient service.” InfoWorld columnist David Linthicum, meanwhile, agrees: “You have to plan and create architectures that can work around the loss of major components to protect your own services.” This is simply good practice in designing any IT system, which is why it is bizarre that so many simply abdicated their responsibility and left it all to Amazon. Foursquare, Reddit, Quora and hundreds more suffered greatly because of failings in Amazon’s data center. Might their suffering have been lessened if they’d planned ahead in a similar way to Netflix or Twilio?

Amazon has said it is reviewing last week’s outage in order to understand what went wrong. But the company must also play a far more proactive role in teaching its customers about ways applications can cost-effectively take advantage of the cloud, as well as how to respond to outages, failures and other problems in the underlying infrastructure. Whatever mistakes Amazon’s customers may have made, whatever penny pinching they attempted to cut cloud services down to the cheapest, least fault-tolerant configuration they could get away with, the initial fault must lie with Amazon. Poorly architected customer systems would not have been pushed to failure if Amazon’s underlying infrastructure had continued to perform as expected. Maybe some long-term good can come from short-term pain and embarrassment.

Question of the week

Does the Amazon outage make you less likely to use the cloud?

Whose HD Streams Reign Supreme?

A number of the major online video services have added an HD option for video uploads, but which one is the best? CNET’s Josh Lowensohn did a really great round-up of HD services and determined YouTube to be the best. Even better, Lowensohn created this handy-dandy chart so you can compare them.

One thing CNET didn’t look at is upload and processing times, which probably is a good thing for YouTube because in my tests with them, I’ve run into major errors on those fronts. And there were a couple services CNET missed, like Motionbox (which the article cops to) and Viddyou. But overall, this is a good starting point, and Lowensohn’s accompanying post is worth a read.

[Chart: CNET’s comparison of HD video services]

How To Stand Out in a Sea of Storage Startups

Online storage companies pop up more frequently than mushrooms after a downpour in Southern France. And like the wild-growing fungus, not all of them are easily digested. Case in point: AOL’s Xdrive, which despite corporate backing recently joined the likes of Omnidrive on the proverbial technology garbage dump. So how does one survive in this sea of startups? Continue Reading & Find Out.

The Crunchies College: B-lessons from the winners

On Friday night TechCrunch, VentureBeat, Read/WriteWeb and GigaOM cosponsored the 2007 Crunchies awards in San Francisco. It was a great event, and in case you couldn’t attend, you can catch the video here.
The line-up of finalists in categories like ‘Best bootstrapped startup’, ‘Best use of viral marketing’ and ‘Best founder’ was stellar, which speaks all the more highly of the winners themselves.
We’ve written about or had contributions from many of them in earlier posts, so we’re using this week’s Found|LINKS to highlight a few of the success attributes and business lessons you can take from this assembly of high-achieving startups. To say this is a knowledge-rich group is an understatement. So settle in for this crash-course from the Crunchies School of Founding.

SmugMug Adds HD Video Support

SmugMug is one of those rare Web 2.0 companies that have the temerity to charge for their photo-hosting and sharing service. A poster child among startups that are utilizing Amazon.com’s Web Services, SmugMug has built a reputation as a quirky but well-managed startup that is profitable and is said to have annual revenues of around $10 million.

How do they do it? By offering a dead simple and easy-to-use service, which starting today includes HD video-hosting as part of the power user and pro-package. SmugMug is going to allow you to upload and share HD videos via its service, which can be played back using QuickTime. The new higher-definition video support means that videos will be accessible over a plethora of devices, including PlayStation 3, the iPhone and the iPod. The maximum supported video size is 1280×720, which makes it easy to watch on those lush plasma screens in the living room.