Microsoft makes its cloud data move

It’s taken Microsoft quite a while to get traction in the cloud, and even longer for it to get its cloud data story right. For the longest time, things weren’t looking good. I say that as someone who has worked with – and at various times championed – Microsoft technology for most of my career. As much as I’ve wanted Microsoft to do well in the cloud data arena, I thought it was doomed to an eternity of near misses.

Fast forward

But things have been steadily improving since the summer, especially in the last few weeks. The glass that was half empty in the spring is now nearly full, with a complete HDInsight Big Data service based on Hadoop 2.0; an able machine learning service called Azure Machine Learning; a document store NoSQL database called DocumentDB; a publish-subscribe service for capturing streaming data called Event Hubs; a service for processing and analyzing that data called Azure Stream Analytics; a data transformation workflow service called Data Factory; and an eponymous Search service with Elasticsearch at its core.

Beyond all of these “house brand” products, partnerships announced in the past two weeks mean that customers can or will soon be able to spin up Hadoop clusters based on Cloudera’s Distribution of Hadoop (CDH) and Hortonworks Data Platform (HDP), running on either Linux or Windows; IBM’s Cloudant NoSQL database, based on BigCouch and Apache CouchDB, is also available; and so is IBM’s relational database standby, DB2. Oracle and DataStax provide access to Oracle 12c and Cassandra on Azure, and other partners allow customers to run MySQL, PostgreSQL and MongoDB.

Competitive landscape

Azure competes well with Google Cloud Platform’s Cloud Dataflow service and parts of its BigQuery service. In numerous other areas, Mountain View has some work to do to catch up with Redmond. But what about the cloud juggernaut, Amazon Web Services (AWS)?

Amazon’s Relational Database Service (RDS), Elastic MapReduce (EMR), DynamoDB, Kinesis, Data Pipeline and CloudSearch each now have opposite numbers in the Azure camp. Amazon does not yet have a service to compete with Azure Machine Learning. On the other hand, its gangbuster-growth service Redshift has no answer from Redmond (although a competitive offering from BitYota is now available on the Azure platform as well as on AWS).

Some of the pieces that have filled out the Azure data story are in preview, while others are in general availability. Many of them appeared in just the last four months. The drinking water on the east side of Seattle is normally very good, but it seems like something extra got into the reservoir this summer.

What went right?

How has such a seemingly rapid improvement taken place? To begin with, Corporate Vice President Scott Guthrie has been laying the groundwork for this for years, pushing to make Azure more innovative and easier to use, and putting it on a path of accelerated iterative improvement. Guthrie now runs Microsoft’s entire Cloud and Enterprise (C+E) division, having succeeded his former boss, now-CEO Satya Nadella, in that role, so support for this innovation and continuous improvement is coming all the way from the top.

Opening up Azure to offer an infrastructure as a service tier, and that tier’s accommodation of Linux, has paved the way for numerous partnerships that would not have been possible otherwise. Oracle, Cloudera, and IBM are but three examples. And the partners aren’t just coming because of operating system compatibility; they’re coming because of an open, apolitical attitude and business spirit that the old guard at Microsoft just couldn’t muster.

Will continuous improvement continue?

As good as all this is, Microsoft still has some loose ends to tie up. A data story isn’t complete without data discovery, modeling, and visualization, and right now that’s all bound up in the Power BI offering on Microsoft’s other cloud, Office 365. Power BI is available as a standalone subscription starting at $480/user/year, or as an add-on to Office 365 E3/E4 subscriptions for an additional $240/user/year, though the latter is a promotional price. Worse yet, the basic reporting story that used to be available in the Azure SQL Reporting service hasn’t yet been added to Power BI at all. Azure SQL Reporting was fully shut down on Friday; for some customers, that was more trick than treat.

Personnel at Microsoft are fond of describing full coverage of something as “all up.” If Microsoft wants a good all-up cloud data story, then Power BI should either move to Azure or be a much more welcome guest there. Customers who are already using and paying for big pieces of the Azure data stack should be welcome guests too, on the Power BI side.

Those customers shouldn’t be forced to pay up to $624/user/year, or to subscribe to services like Exchange, SharePoint and/or Lync that they don’t necessarily need. Microsoft should also make more of its data services available on-premises, to facilitate hybrid scenarios and generally give enterprise customers’ workloads an on-ramp to the cloud.

Talk back

But Microsoft is doing a lot of things right. And among the customers and vendors we’re talking to, Azure is garnering attention as the most enterprise-friendly of all clouds. Put this all together, and Amazon and Google have some explaining to do.

With Google’s Cloud Platform Live event in San Francisco happening on November 4th and Amazon’s re:Invent event a week later in Las Vegas, we may get those explanations pretty soon. And given that both events are sold out, it seems lots of people will be listening. I have a feeling a few PCs in Redmond will be tuned in to the live streams.