Google open sources a MapReduce framework for C/C++

Google announced on Wednesday that the company is open sourcing a MapReduce framework that will let users run native C and C++ code in their Hadoop environments. Depending on how much traction MapReduce for C, or MR4C, gets and by whom, it could turn out to be a pretty big deal.

Hadoop is famously, or infamously, written in Java and as such can suffer from performance issues compared with native C++ code. That’s why Google’s original MapReduce system was written in C++, as is the Quantcast File System, that company’s homegrown alternative for the Hadoop Distributed File System. And, as the blog post announcing MR4C notes, “many software companies that deal with large datasets have built proprietary systems to execute native code in MapReduce frameworks.”

This is the same sort of rationale behind Facebook’s HipHop efforts and database startup MemSQL, whose system converts SQL to C++ before executing it.


MR4C was developed by satellite imagery company Skybox Imaging, which Google acquired last June, and was optimized for geospatial data and computer vision code libraries. Of course, open sourcing MR4C presents the opportunity to open up this capability to a broader range of users, whether they work in fields dominated by C libraries or just don’t like or aren’t comfortable writing programs in Java. When Google announced its open-source Kubernetes container-management system last year, it was quickly ported from Google Compute Engine to run in several other environments.

It will be interesting to see how much traction MR4C gets at this point, especially given the surge in interest around Apache Spark. Spark is a faster data-processing framework than MapReduce, already has a lot of interest, and natively supports Scala, Python and Java, although it does not support C/C++.

The future of Hadoop and big data processing will certainly be a big topic of conversation at our Structure Data conference next month in New York, which features Google VP of infrastructure Eric Brewer, Spark co-creator (and Databricks CEO) Ion Stoica and the CEOs of all three major Hadoop vendors.

Amazon Web Services gets with the Golang program

Amazon Web Services already offers software development kits (SDKs) for the Java, C#, Ruby, Python, JavaScript, PHP and Objective-C programming languages. Now it says it will add Go (aka Golang) to that list. More accurately, it says it’s taken over aws-go, an SDK developed by Stripe. The SDK is in a sort of beta stage, with work continuing.

While Go doesn’t have nearly as many users as Java or C (an IEEE Spectrum survey ranked it as the nineteenth most popular language, between Scala and Arduino), it’s gained traction among developers. That’s especially true for those building cloud or web services infrastructure. Docker, which has taken the development world by storm, is written in Go, for example. This slideshow explains why.

Just ask Derek Collison, founder and CEO of Apcera, who called Golang’s rise to fame more than two years ago.

Asked if he still holds that opinion, he was unequivocal. “[Go] got quite a bit right with the language for the Get Sh*t Done crowd, it appeals to them and anyone doing distributed systems or cloud platforms … should at least consider it,” he said via email. Apcera was built on Go and 95 percent of its codebase is Go, he said, adding: “We made this decision over Node.js which has had its struggles as of late on the community side.”

SendGrid is moving over to Golang from Perl, and here’s a succinct rationale by SendGrid co-founder Tim Jenkins, who said Go, which incubated at Google, facilitates concurrent programming so that calculations can execute at the same or overlapping times rather than one at a time.

While what we do isn’t rocket science, doing it at a scale of over 500 million messages per day is extremely challenging … One of the most compelling reasons for using Go at SendGrid is having the concept of concurrent asynchronous programming as part of the language. The argument always came up that you can do asynchronous programming in Java, but it isn’t pretty. My argument was always that ‘We can keep doing it in Perl,’ which usually helped people to understand that just because you can do something with a technology doesn’t mean it’s the best way to do it.

Amazon, which wants to be the default tool provider to developers everywhere, is filling in an important checkbox here.

This story was updated at 8:10 a.m. with Derek Collison’s comments.

Progress shells out $262.5M for Bulgarian app development tools provider Telerik

Progress Software, the Bedford, MA-based enterprise software infrastructure firm, is buying the Bulgarian UI framework and app development tools outfit Telerik for $262.5 million, to help its customers make nicer user interfaces for their cloud and on-premises apps. Telerik is used by 1.4 million developers, including those at 450 of the Fortune 500 companies. Founded in 2002, it started with a focus on Microsoft’s .NET platform before expanding to other platforms under the name “Kendo UI” (it consolidated its brands last year). According to Progress CEO Phil Pead, the acquisition will make his company “a destination site for the largest developer population on the planet – ABL, .NET, Java, JavaScript, Node.js and mobile.”