Microsoft buys data science specialist Revolution Analytics

Microsoft has agreed to acquire Revolution Analytics, a company built around commercial software and support for the popular R statistical computing project. The open source R project is hugely popular among data scientists and research types, and having Revolution’s R experts in-house could be a big deal for Microsoft as it tries to establish itself as the go-to place for data science software.

Among Revolution’s additions to the standard R capabilities were simplifying the use of the program and engineering it to run across big data systems such as Hadoop. Here’s how Joseph Sirosh, Microsoft’s corporate vice president for machine learning, explains what the deal means in a blog post:

As their volumes of data continually grow, organizations of all kinds around the world – financial, manufacturing, health care, retail, research – need powerful analytical models to make data-driven decisions. This requires high performance computation that is “close” to the data, and scales with the business’ needs over time. At the same time, companies need to reduce the data science and analytics skills gap inside their organizations, so more employees can use and benefit from R. This acquisition is part of our effort to address these customer needs.

. . .

This acquisition will help customers use advanced analytics within Microsoft data platforms on-premises, in hybrid cloud environments and on Microsoft Azure. By leveraging Revolution Analytics technology and services, we will empower enterprises, R developers and data scientists to more easily and cost effectively build applications and analytics solutions at scale.

Sirosh will be speaking at Gigaom’s Structure Data conference, which takes place March 18-19 in New York.

A simple example of a plot using R.

In the blog post, Sirosh also promised to continue contributing to the open source R community, as well as to continue developing Revolution’s products. He reiterated Microsoft’s renewed (or just plain new) commitment to open source software, which includes contributions to various Hadoop-related projects and support for many open source technologies on the Azure platform.

In a separate blog post, Revolution’s David Smith detailed Microsoft’s specific commitment to R, including within the Azure Machine Learning service it announced in June:

And Microsoft is a big user of R. Microsoft used R to develop the match-making capabilities of the Xbox online gaming service. It’s the tool of choice for data scientists at Microsoft, who apply machine learning to data from Bing, Azure, Office, and the Sales, Marketing and Finance departments. Microsoft supports R extensively within the Azure ML framework, including the ability to experiment and operationalize workflows consisting of R scripts in MLStudio.

When Microsoft CEO Satya Nadella went on a cloud computing road show in October, touting the scale of Microsoft’s cloud efforts, I argued that applications, not scale, would always be Microsoft’s big advantage in that space. The same holds true for the world of big data and data science.

Revolution Analytics and the R project might not be household names in most circles, and they certainly won’t be a major driver of Microsoft revenue any time soon, but they are a big deal in the world of predictive analytics and machine learning. That’s an emerging market that Microsoft wants to get in on early, while so many other vendors are still pushing yesterday’s technologies or focused on building out infrastructure to store all the data companies want so badly to analyze.

Online gambling: lessons in marketing tech and customer behavior

Online gaming (we’re talking betting, not Angry Birds) is in some ways an ideal laboratory for the study of customer behavior.

What promotional offers provide the most effective incentives for bettors to increase their deposits or frequency of play? How are customers best carved into ever-smaller micro-segments for campaigns that maximize their participation and lifetime revenue value?

But online betting is also a rare market in that it was established in Europe and Israel 10 to 15 years before it really gained a foothold in the U.S. So maybe it’s not surprising that when U.S.-based BAMSAS, which provides online betting for horse racing and fantasy sports, was looking for a marketing automation platform, especially for email campaigns, it settled on an Israeli startup that first gained traction in the European online gaming market.

What was it that drew BAMSAS to Optimove? According to BAMSAS VP of Marketing Pete Laverick, the firm considered a half dozen of the usual suspects, including Oracle Eloqua, Marketo, and ExactTarget’s Pardot (now owned by Salesforce). Yes, it was convenient that Optimove had so many gaming customers, but what the other vendors really couldn’t match for him was the predictive capabilities that were integral to the Optimove system. (BAMSAS and Optimove further integrate their systems with SilverPop as well.)

Watching behavior, maintaining privacy

What’s interesting about Optimove is that their constantly refined predictive testing of microsegment or individual customer responses and future activity is entirely based on account activity, a lesson we’ve seen from games and applied to news. How much was bet when? And, what deposits have been kept on balance? Optimove’s SaaS product doesn’t actually collect or use any customer-identifying demographic data, which eases lots of privacy concerns. It is entirely based on behavior, and it is constantly learning from previous campaigns. (BAMSAS does integrate minimal information on location and age as part of its meeting legal requirements, but has otherwise used only limited demographic information to date.)

Pete says that much of the American online gaming market appears to be where the European market was a decade ago, in that companies are primarily focused on attaining customers, rather than on the retention and optimization of customers that they have. Having worked in the European market as the shift was made to retention and optimization, he knew that that was where the greater challenge and reward are, and so he sought a system that could learn and deliver on that front.

BAMSAS has over 150,000 customers and Pete estimates that they receive 750,000 emails a month, in aggregate, from the company. BAMSAS does not yet have completely individualized learning and campaigns for all of their customers, but many campaigns are refined down to microsegments of only two or three individuals.

The system is constantly refining just the right premium to offer, and when to offer it, for each microsegment. The goal is to convert a free customer to a paying one, nudge a paying customer to up his or her deposit balance, or increase his or her frequency or scale of betting. Customer conversion and retention are paramount, and an ever-refined projected lifetime value for each customer segment is always factored into the equation.
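To make the idea concrete, here is a minimal Python sketch of learning from past campaigns which offer to send to each microsegment. It is an illustration only, not Optimove's actual method, which this article does not describe in detail. The `CampaignResult` record, its fields, the segment and offer names, and the choice of "highest average deposit lift" as the scoring rule are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CampaignResult:
    """One past campaign outcome (hypothetical record; all fields are assumptions)."""
    segment: str         # behavior-based microsegment label
    offer: str           # promotional offer that was sent
    deposit_lift: float  # change in deposits observed after the offer


def best_offer_per_segment(results):
    """Return, for each segment, the offer with the highest mean deposit lift."""
    # Accumulate [sum of lift, number of observations] per (segment, offer) pair.
    totals = defaultdict(lambda: [0.0, 0])
    for r in results:
        entry = totals[(r.segment, r.offer)]
        entry[0] += r.deposit_lift
        entry[1] += 1

    # Keep the offer with the best average lift seen so far for each segment.
    best = {}
    for (segment, offer), (total, count) in totals.items():
        mean = total / count
        if segment not in best or mean > best[segment][1]:
            best[segment] = (offer, mean)
    return {segment: offer for segment, (offer, _) in best.items()}


# Made-up history purely for illustration; no personal or demographic data is used.
history = [
    CampaignResult("weekend_small_bets", "free_bet", 5.0),
    CampaignResult("weekend_small_bets", "free_bet", 15.0),
    CampaignResult("weekend_small_bets", "deposit_match", 30.0),
    CampaignResult("lapsed_high_balance", "free_bet", 20.0),
    CampaignResult("lapsed_high_balance", "deposit_match", 10.0),
]
print(best_offer_per_segment(history))
```

A production system would presumably also weigh timing and projected lifetime value, as described above, rather than a single average; the sketch covers only the "learn from previous campaigns" step.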

Finally, yes, this was a product selection that was driven and made by the VP of marketing, rather than the IT department. How has integration with IT gone? Pete says all is well and the only impediment they had was finding a need to clean up their data, which was a good and right move for them to make anyway.

Researchers say AI prescribes better treatment than doctors

Two Indiana University researchers have developed a computer model they say can identify significantly better and less-expensive treatments than can doctors acting alone. It’s just the latest evidence that big data will have a profound impact on our health care system.

Big data politics: Why you can’t outrun campaigns by avoiding the TV

Campaigns have been profiling potential voters for decades, but the glut of data available online changed the game in terms of how much they collect and how it’s used. Now, thanks to complex models and real-time ad platforms, political advertising is becoming a personal affair.

Can Kaggle make data science a spectator sport?

Data science competition platform Kaggle is opening up the leaderboards for its invitation-only private competitions, meaning anyone can watch and see how the world’s best data scientists are faring in these special challenges. Can data science actually become a spectator sport in the analytics community?