All aboard the Hadoop money train

Market research firm IDC released the first legitimate market forecast for Hadoop on Monday, claiming the ecosystem around the de facto big data platform will sell almost $813 million worth of software by 2016. But IDC’s forecast doesn’t tell the whole story. Hadoop’s actual economic impact is likely much, much larger.

Viewed alone, IDC’s forecast offers impressive enough numbers — a $77 million Hadoop market growing at a 60 percent compound annual growth rate until it hits the $812.8 million mark in 2016. The number would be higher, the report concludes, if Hadoop’s open source status didn’t drive down the prices that vendors pushing proprietary products could charge. Indeed, with vendors such as Hortonworks pushing the all-open-source approach to Hadoop, others do have to keep their license fees in check.
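As a quick sanity check on those figures, the growth arithmetic can be sketched in a few lines of Python. This assumes the $77 million figure is the 2011 base year and that growth compounds over five annual periods through 2016. The function names are illustrative only and are not taken from IDC's report.

```python
def cagr(start, end, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end / start) ** (1 / years) - 1


def project(start, rate, years):
    """Value after compounding `start` at `rate` for `years` periods."""
    return start * (1 + rate) ** years


# Figures quoted in the article, in millions of dollars.
BASE_2011 = 77.0      # assumption: $77 million is the 2011 base
TARGET_2016 = 812.8   # IDC's 2016 forecast
YEARS = 5             # assumption: 2011 through 2016

implied_rate = cagr(BASE_2011, TARGET_2016, YEARS)
projected = project(BASE_2011, 0.60, YEARS)

print(f"Implied CAGR: {implied_rate:.1%}")
print(f"$77M at a flat 60% for {YEARS} years: ${projected:.1f}M")
```

Under those assumptions, the implied rate comes out just above 60 percent, and compounding at a flat 60 percent lands at roughly $807 million, so the quoted numbers are consistent with each other once rounding is taken into account.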

According to separate emails from report co-authors Carl Olofson and Dan Vesset, IDC’s revenue calculations take into account software, maintenance and software-as-a-service revenue. That means they account for the usual suspects such as Cloudera, MapR and Hortonworks at the distribution layer, as well as the myriad vendors working a layer above on Hadoop-based applications, databases and management tools. IDC also accounted for cloud-based Hadoop services such as Amazon Web Services’ Elastic MapReduce (and, presumably, upstarts such as Infochimps and Mortar Data).

Time will tell if IDC’s forecast is accurate, but it’s definitely welcome. Hadoop isn’t a new version of the relational database, as some have suggested, but a whole new platform that sits beside legacy software and likely won’t replace it. In theory, then, any money spent on Hadoop is new money. And because Hadoop began as an open source project and is coming of age in the cloud computing era, it’s not easy to look to past technologies for guidance.

An impending Hadoop explosion in the cloud

It’s in the cloud where things could get particularly interesting. Olofson noted that he and Vesset estimated “little if any revenue for [Elastic MapReduce] in 2011,” although I’m not certain I agree. That service is actually quite popular and accounts for some serious Hadoop use; some users run several thousand nodes at a time. If it’s generating even a few million dollars (just a fraction of AWS’s estimated overall revenue), that’s a not-insignificant piece of a $77 million Hadoop space.

But cloud computing is already having a meaningful impact on Hadoop in other ways that will only expand. One is its role as the deployment model of choice for many web startups, more and more of which are finding a way to make big data a part of their business model. Whether they’re big data applications or just applications that use big data, they will likely use Hadoop, and they’re likely not going to pay a lot of money for it. If they’re not hosting and managing their own Apache Hadoop cluster, they’ll probably use a cloud-based Hadoop offering, which could mean significant growth in that segment of the ecosystem.

The externalities of Hadoop

Moreover, every company, startup or established, that offers a service powered in some part by Hadoop adds to the platform’s overall economic impact. Hadoop is a big data storage-and-processing framework as well as a positive-externality generator. Facebook, Twitter, Yahoo, Etsy, ipTrust, BloomReach, Skybox, Climate Corporation, Zions Bancorporation: every sale made, fraud thwarted or page view generated thanks to Hadoop means a healthier economy. The dollar amount directly attributable to Hadoop probably isn’t calculable, but it’s likely rather large and growing.

It’s difficult to even fathom the external effects of Hadoop in 2016. A projected $812.8 million market means Hadoop is becoming fairly ubiquitous, especially if one considers how many companies are using free software in addition to those paying for it. Already, I’ve been told, the majority of Fortune 500 companies are at least experimenting with Hadoop. It might not be a godsend, and might be just the starting point of a meaningful big data strategy, but Hadoop is going to have a major impact on how businesses do business.

Image courtesy of Flickr user RachScottHalls.