Apache Drill has graduated to Top-Level Project status. Is it merely ready for primetime or will it succeed there? And does the Big Data world need another SQL-on-Hadoop engine?
MapR throws Apache Drill in the box, Cassandra 2.1 releases, and that’s just the beginning.
MapR has integrated the Apache Drill SQL-on-Hadoop engine into its big data platform. MapR led the development of Drill, which is part of a larger movement within the company toward building a stronger open-source culture.
Proprietary Hadoop distro-provider MapR is behind a new open source Apache project, called Drill, to re-engineer Google’s Dremel software, a tool for querying big datasets really fast. Interestingly, MapR is taking an open source approach this time around, which is telling.
MapR officials claim that in 2009 the company had no choice but to switch to a closed source model for its Hadoop distribution, because the Hadoop community at that time was unwilling to make the changes MapR requested. MapR went ahead and plugged some major reliability holes in the Hadoop code, which gave it a head start in the enterprise. But those issues have been mostly fixed in the open source code today. Now MapR must race to stay ahead of the hundreds of developers focused on the open source distributions, which is a tall order for a small company. Or it could spearhead a new open source initiative of its own. Step forward, Drill.
My guess is that MapR did not foresee the momentum that would be created by today’s successful open source initiatives and how this can power businesses, and Drill is a way for it to get back into the game.
Open source software used to lag proprietary software innovation, but the tables have turned quickly with the advent of cloud computing. Now open source is driving the innovation, and since cloud is the new model for IT, open source is becoming an increasingly dominant part of all new software development.
Vendors (with open source products and business models) claim that the shift to open source is being driven by customer adoption of open source products. They say customers don’t want cloud technology to become the next monopoly. That’s bunk in my opinion. Customers rarely know what they want. I think these vendors believe that the only way to unseat the current monopolies in IT (Oracle, IBM, Cisco, Microsoft, VMware, etc.) is to create an open source community of like-minded developers to rail against the old guard and, via a freemium model, seed enough of the software into the market that it eventually catches on. They might be right. It’s working for Cloudera and Hortonworks (open source Hadoop distributions); it’ll probably work for OpenStack (open source IaaS); Ubuntu (shepherded by Canonical) is the most popular OS in the cloud; and WordPress is helping you read much of the web. The list goes on.
The Drill project is also a way for MapR to lure talented developers to the company where they will get the chance to work on what might be the next big application in the Hadoop ecosystem. Being open source means the project develops faster than any proprietary effort could, and it’s a system of collaboration, which young developers are now more comfortable with than working in a proprietary, closed environment.
MapR isn’t the only company behind Drill. Drawn to Scale and Concurrent are also backing the project and no doubt also looking for ways to attract talented developers with Hadoop skills to their organizations. Let’s see who else jumps aboard.
Disclosure: Automattic, the company behind WordPress, is backed by True Ventures, a venture capital firm that is an investor in the parent company of this blog, Giga Omni Media. Om Malik, founder of Giga Omni Media, is also a venture partner at True.