I apologize if I’m late to the game on this, but someone just tweeted me about Apache Tajo, a potentially interesting new SQL query engine for Hadoop. I’m not sure how much traction it can possibly gain given the glut of other options out there (take a look at this now extremely outdated roundup from February), but I guess more options are better for users, to a point. SK Telecom, a Korean carrier, is already a big fan. Also, some of Tajo’s contributors’ employers are kind of interesting.
We have been hearing about things like YARN and high availability for a few years, and they’ve even been incorporated into some commercial Hadoop distributions, but now they’re finally part of the official Apache Hadoop code base. Technically, the release is version 2.2.0. “The project’s latest release marks a major milestone more than four years in the making, and has achieved the level of stability and enterprise-readiness to earn the General Availability designation,” according to an Apache Software Foundation press release.
Hortonworks is working to integrate the Storm stream-processing engine with its Hadoop distro, and hopes to have it ready for enterprise apps within a year’s time. It’s the latest non-batch functionality for Hadoop thanks to YARN, which lets Hadoop run all sorts of processing frameworks.
VMware is launching a new open source project, called “Serengeti,” that aims to let the Hadoop data-processing platform run on the virtualization leader’s vSphere hypervisor. VMware apparently smells a lucrative opportunity in Hadoop and isn’t about to miss out on getting a piece of the pie.
One year after launching into the Hadoop market with much anticipation, Yahoo spinoff Hortonworks finally has a product available. The company announced version 1.0 of its flagship Hortonworks Data Platform on Tuesday, as well as a High Availability version designed with new partner VMware.
As the world once again starts analyzing Yahoo’s myriad woes after Sunday morning’s ouster of embattled CEO Scott Thompson, I’m left wondering if its investment in Hadoop didn’t aid in the company’s demise, even if it’s well down the long list of Yahoo’s mistakes.
Already a heavy user of Apache Software Foundation projects, Twitter is now giving back to the organization financially as a sponsor. It’s difficult to think of a situation where Apache sponsorship wouldn’t be the right move, and it’s definitely the right thing for Twitter to do.
In the wake of CloudStack’s announcement, Cloudscaling’s CTO, Randy Bias, takes issue with its claims of “AWS compatibility” and “true Amazon-style architecture.” According to Bias, “no one should be under the illusion that CloudStack is more AWS/Amazon compatible than any other open source cloud software.”
Rob Bearden, CEO of Hortonworks, the Hadoop startup that spun out of Yahoo in June 2011, knows a thing or two about making open source software profitable. And he thinks Hadoop has an opportunity to be bigger than the markets for JBoss, SpringSource and MySQL combined.
Cloudera and Hortonworks have been playing a game of one-upmanship over the past few weeks in an attempt to prove whose contributions to the Apache Hadoop project matter most. Reputation matters to both companies, but maybe not as much as fending off encroachments on their turf.