Microsoft queues up DocumentDB for broad availability

Microsoft continues to fill in the check boxes for its Azure cloud. Example: Azure DocumentDB, Microsoft’s take on NoSQL databases a la Couch or MongoDB, will be generally available April 8, the company said Thursday.

The beauty of these document databases is that they can ingest JavaScript Object Notation (JSON) formatted information as is, with no need for the mapping process required to pump it into relational SQL databases. [company]Amazon[/company] Web Services added JSON support to its DynamoDB database late last year.
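To make the mapping point concrete, here is a minimal Python sketch; it is not DocumentDB or DynamoDB code. A plain dict stands in for a document collection and the standard-library sqlite3 module stands in for a relational database. The order record, table names and helper functions are all invented for illustration.

```python
import json
import sqlite3

# Hypothetical order record, as an application might emit it.
raw = '{"id": "o-1", "customer": "Ada", "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]}'
order = json.loads(raw)

# Document-store style: keep the parsed JSON whole, keyed by its id.
# (A plain dict stands in for a document collection here.)
collection = {}
collection[order["id"]] = order

# Relational style: the nested "items" array has to be mapped onto a
# second table, with a foreign key back to the parent order.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT)")
db.execute(
    "CREATE TABLE order_items ("
    "order_id TEXT REFERENCES orders(id), sku TEXT, qty INTEGER)"
)
db.execute("INSERT INTO orders VALUES (?, ?)", (order["id"], order["customer"]))
db.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(order["id"], item["sku"], item["qty"]) for item in order["items"]],
)
db.commit()


def total_qty_document(order_id):
    """Sum item quantities straight from the stored document."""
    return sum(item["qty"] for item in collection[order_id]["items"])


def total_qty_relational(order_id):
    """Sum item quantities from the mapped child table."""
    row = db.execute(
        "SELECT SUM(qty) FROM order_items WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0]
```

Both helpers answer the same question, but only the relational path needed a schema and an explicit mapping step for the nested array.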

[company]Microsoft[/company] announced DocumentDB in August.

Microsoft, which is trying to kit out Azure as a comfy home for a wide range of workloads, supports a variety of homegrown and third-party databases, including MongoDB, Microsoft SQL Azure and Oracle.

Microsoft also said Azure Search, which works across more than 50 languages, is now available. This “search-as-a-service” targets developers who want to add full-text search into their applications.

The company also unveiled a new premium encoder for Azure Media Services.

For a primer on DocumentDB check out the video below.

[protected-iframe id=”4448a4bcd0bd8c2c93244aa57b76ff78-14960843-26974994″ info=”//channel9.msdn.com/Shows/Azure-Friday/Azure-DocumentDB-101-with-Ryan-CrawCour/player” width=”960″ height=”540″ frameborder=”0″]

If you thought cloud competition couldn’t get hotter, think again

Chinese e-commerce giant Alibaba has opened a data center hub in Silicon Valley, adding yet another gigantic player to a growing but already hotly contested cloud computing market.

Aliyun, Alibaba’s cloud computing arm, has been likened to Amazon.com’s Amazon Web Services unit and you can bet that [company]Amazon[/company], as well as [company]Google[/company] and [company]Microsoft[/company], are watching this development closely. Those American cloud giants are focused on boosting business and operations outside the U.S. (Microsoft and Amazon have a presence in China, for example), and now Aliyun will return the favor with its first U.S.-based data center.

The initial plan is for the Aliyun data center, the exact location of which was not disclosed, to target Chinese companies based in the U.S. and to expand from that base. In a statement Aliyun VP Ethan Sicheng Yu said:

… the ultimate objective of Aliyun is to bring cost-efficient and cutting-edge cloud computing services to benefit more clients outside China to boost their business development.

The U.S. expansion comes at an interesting time politically as well: relations are tense between the Chinese and U.S. governments, and each side has accused the other of spying and of using native tech companies to help in this effort.

Aliyun’s current data centers are in Hangzhou, Qingdao, Beijing, Shenzhen and Hong Kong.

For retailers the buy-or-build cloud decision looms large

If you need proof that cloud deployment stories can touch off religious disputes, my recent report about @Walmartlabs deploying 100K cores of OpenStack to run the retail giant’s e-commerce operations is Exhibit A.

This is, by any measure, a massive private cloud, and some readers were incredulous that [company]Walmart[/company] would go this route instead of plying public cloud services. It’s the old build versus buy discussion all over again, with many of the participants weighing in on the “buy” side.

One reader termed this decision “ridiculous,” pointing out that @walmartlabs has hired 1,000 or so engineers over the past year, although no one said all those people were dedicated to building or maintaining the aforementioned OpenStack private cloud. Still, the argument is that if you go with public cloud, you won’t need to bring that much expensive talent in house. Engineering talent is pricey, especially in Silicon Valley. @walmartlabs is headquartered in San Bruno, Calif.

Wal-Mart Store

His opinion is that a big retail outfit is far better off using “out of the box” public cloud capabilities for much of its work rather than reinventing the wheel (or building its own cloud). For this camp, Walmart’s decision to build a customizable and flexible cloud with OpenStack makes no sense.

On the other hand, private cloud (and OpenStack) proponents noted joyously that Walmart’s work proves “private cloud deniers” wrong. (Does anyone else find that phrase disturbing? It brings to mind thoughts of climate change and Holocaust deniers and seems to lack a sense of proportionality, but back to the topic.)

Server Density CEO David Mytton, a buy sider, wrote about the Walmart private cloud here. Bottom line, he said Walmart is:

dedicating significant resources to building their own “private cloud” and although it’s true there is no specific vendor lock-in, they are locked into their own development. They’re competing in resources, talent and innovation against the public cloud providers (who have more resources to dedicate to engineering both product features and efficiency at scale).

Anybody but AWS?

Remember, given the competitive retail landscape, Walmart was hardly likely to run on the Amazon Web Services public cloud, seeing as how Amazon.com is seen as Darth Vader by much of the rest of the retail universe. Target used Amazon.com (not AWS) for infrastructure but left the fold in 2011.

AWS would likely point out, if it were prone to comment on such things, that its cloud business is run as a separate entity from [company]Amazon.com[/company]; [company]Netflix[/company] is a huge customer, after all, and Amazon also runs Amazon Instant Video. But I’ve talked to other retailers who, off the record, will point to the political incorrectness of turning over key retail functions to Darth, er, AWS.

Jeff Aden, co-founder of 2nd Watch, a systems integrator that works with customers to deploy AWS, said his company has several retail customers running on AWS, including Diane Von Furstenberg. Other AWS retail users include Gilt.com and Nordstrom Rack.

Mytton conceded that AWS might be a tough sell for a big retailer, but why not throw in with [company]Google[/company] Cloud Platform or [company]Microsoft[/company] Azure? He points out that Ocado, the big British retailer, is a Google cloud customer.

Last week I spoke with Sudhir Hasbe, director of software engineering BI and data services for Zulily, a members-only online fashion retailer that has fully embraced Google cloud services — BigQuery, Google Storage and Google Compute Engine. In this, Zulily is sort of a counter-narrative to the @Walmartlabs story.

Zulily puts 9,000 new items on its site daily but wants to make sure it displays only the items that are relevant and potentially of interest to a given shopper. A woman who shops for herself and maybe a 6-year-old boy will see options for those demographics and not have to wade through the rest. “Search doesn’t work well in retail,” Hasbe said.

“For this we need the full big data platform so we can perform maximum data processing: what preferences do they have, what do they like. It also means when you have that much data, the whole supply chain side needs to consume it to make decisions,” he noted.

What’s nice about deploying Hadoop clusters on GCE is that once the processing has run, the data is pushed into BigQuery where it’s available to all the business units and analysts, and the bill for Hadoop processing stops. The data is all stored in inexpensive Google Storage.

Anyway, feel free to comment on when and in what circumstances it makes sense to deploy public cloud or BYO private cloud. But please keep it polite.

Agree or not, Mark Cuban’s take on net neutrality is worth a listen

For those who missed it, Mark Cuban visited the Structure Show last week to reiterate and explain his thinking on net neutrality and why he thinks turning over internet governance to the FCC is a big mistake. Check it out below.

[soundcloud url=”https://api.soundcloud.com/tracks/193100656″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

Cloud options mean decisions, decisions for IT buyers

Much has been written about cloud consolidation, with M&A roiling the cloudscape over the past few months: Cisco bought Metacloud, EMC bought Cloudscaling, HP snapped up Eucalyptus. Despite all that, cloud deployment options abound, and choice will be a big theme at the upcoming Structure 2015 event, this June in San Francisco.

First, there is more choice than ever in public cloud. Sure, Amazon Web Services leads the market-share race by a wide margin. But viable options are available — from Microsoft Azure to Google Cloud Platform to vCloud Air to Digital Ocean to CenturyLink. What many of us tend to forget is that, despite all the cloud talk, we’re still very early in the game when it comes to business deployment. There’s a ton of opportunity out there. Is it enough to float all boats? That’s the zillion-dollar question.

We will discuss those options, and how even the biggest enterprises, General Electric and Walmart among them, are deploying more of their IT on cloud. The question is no longer if, but when.

At this year’s event, we’ll welcome back [company]Amazon[/company] CTO Werner Vogels, Khosla Ventures founder Vinod Khosla, [company]Microsoft[/company] EVP Scott Guthrie, Google SVP Urs Hölzle, Battery Ventures technology fellow Adrian Cockcroft and DataGravity CEO Paula Long.

We’ll hear from first-timers, too: Canonical founder Mark Shuttleworth, Digital Ocean CEO Ben Uretsky, CoreOS CEO Alex Polvi. And, on the end user side, we’re really excited to bring on stage National Football League CIO Michelle McKenna-Doyle, FBI CISO Arlette Hart and Pinterest head of engineering Michael Lopp. More names to come.

For a refresher of last year’s event, here’s a sampling of some favorite sessions:

Google’s Urs Hölzle:

[youtube https://www.youtube.com/watch?v=I9R4P0TLViA]

Facebook’s Jay Parikh:

[youtube https://www.youtube.com/watch?v=F9FYTbxWK1o]

Intel SVP Diane Bryant:

[youtube https://www.youtube.com/watch?v=HTXuwqLUw7M]

Amazon’s Werner Vogels:

[youtube https://www.youtube.com/watch?v=oZPlr2-KMnw]

Microsoft’s Scott Guthrie:

[youtube https://www.youtube.com/watch?v=TImzXnUaO0A]

Remember when machine learning was hard? That’s about to change

A few years ago, there was a shift in the world of machine learning.

Companies, such as Skytree and Context Relevant, began popping up, promising to make it easier for companies outside of big banks and web giants to run machine learning algorithms and to do it at a scale congruent with the big data promise they were being pitched. Soon, there were many startups promising bigger, faster, easier machine learning. Machine learning became the new black as it became baked into untold software packages and services — machine learning for marketing, machine learning for security, machine learning for operations, and on and on and on.

Eventually, deep learning emerged from the shadows and became a newer, shinier version of machine learning. It, too, was very difficult and required serious expertise to do. Until it didn’t. Now, deep learning is the focus of numerous startups, all promising to make it easy for companies and developers of all stripes to deploy.

Joseph Sirosh

But it’s not just startups leading the charge in this democratization of data science — large IT companies are also getting in on the act. In fact, Microsoft now has a corporate vice president of machine learning. His name is Joseph Sirosh, and we spoke with him on this week’s Structure Show podcast. Here are some highlights from that interview, but it’s worth listening to the whole thing for his take on Microsoft’s latest news (including support for R and Python in its Azure ML cloud service) and competition in the cloud computing space.

You can also catch Sirosh — and lots of other machine learning and big data experts and executives — at our Structure Data conference next month in New York. We’ll be highlighting the newest techniques in taking advantage of data, and talking to the people building businesses around them and applying them to solve real-world problems.

[soundcloud url=”https://api.soundcloud.com/tracks/191875439″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

Download This Episode

Subscribe in iTunes

The Structure Show RSS Feed

Why the rise of machine learning and why now

“I think the cloud has transformed [machine learning], the big data revolution has transformed it,” Sirosh said. “But at the end of the day, I think the opportunity that is available now because of the vast amount of data that is being collected from everywhere . . . is what is making machine learning even more attractive. . . . As most of behavior, in many ways, comes online on the internet, the opportunity to use the data generated on interactions on websites and software to tailor customer experiences, to provide better experiences for customers, to also generate new revenue opportunities and save money — all of those become viable and attractive.”

Asked whether all of this is possible without the cloud, Sirosh said he thinks it is, but, like most things, it would be a lot more difficult.

“The cloud makes it easy to integrate data, it makes it easy to, in place, do machine learning on top of it, and then you can publish applications on the same cloud,” he said. “And all of this process happens in one place and much faster, and that changes the game quite a bit.”

Deep learning made easy and easier

Sirosh said he began his career in neural networks and actually earned his Ph.D. studying them, so he’s happy to see deep learning emerge as a legitimately useful technology for mainstream users.

“My take on deep learning is actually this,” he explained. “It is a continuing evolution in that field, we just have now gotten to the level where we have identified great algorithmic tricks that allow you to take performance and accuracy to the next level.”

Deep learning is also an area where Microsoft sees a big opportunity to bring its expertise in building easily consumable applications to bear. Azure ML already makes it relatively easy to train deep neural networks using the same types of methods as its researchers do, Sirosh noted, but users can expect even more in the months to come.

“We will also provide fully trained neural networks,” he said. “We have a tremendous amount of data in images and text data and so on inside of Bing. We will use our massive compute power to learn predictive models from this data and offer some of those pre-trained, canned neural networks in the future in the product so that people will find it very easy to use.”

A set of images that the Microsoft system classified correctly.

The results of a Microsoft computer vision algorithm it says can outperform humans at some tasks.

How easy can all of this really be?

As long as there are applications that can hide its complexity, Sirosh has a vision for machine learning that’s much broader than even the world of enterprise IT sales.

“Well, we are actually going after a broad audience with something like machine learning,” he said. “We want to make it as simple as possible, even for students in a high school or in college. In my way of thinking about it, if you’re doing statistics in high school, you should be able to use [a] machine learning tool, run R code and statistical analysis on it. And you can teach machine learning and statistical analysis using this tool if you so choose to.”

Is Microsoft evolving from an operating system company to a data company?

Not entirely, but Sirosh did suggest that Microsoft sees a shift happening in the IT world and is moving fast to ride the wave.

“I think you should even first ask, ‘How big is the world of data to computing itself?’” he said. “I would say that in the future, a huge part of the value being generated in the field of computing . . . is going to come from data, as opposed to storage and operating systems and basic infrastructure. It’s the data that is most valuable. And if that is where in the computing industry most of the value is going to be generated, well that is one place where Microsoft will generate a lot of its value, as well.”

Microsoft’s machine learning guru on why data matters sooooo much

[soundcloud url=”https://api.soundcloud.com/tracks/191875439″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

Not surprisingly, Joseph Sirosh has big ambitions for his product portfolio at Microsoft, which includes Azure ML, HDInsight and other tools. Chief among them is making it easy for mere mortals to consume these data services from the applications they’re familiar with. Take Excel, for example.

If a financial analyst can, with a few clicks, send data to a forecast service in the cloud, then get the numbers back, visualized on the same spreadsheet, that’s a pretty powerful story, said Sirosh, who is corporate VP of machine learning for Microsoft.

But as valuable as those applications and services are, more and more of the value to be derived from computation over time will come from the data itself, not all those tech underpinnings. “In the future a huge part of the value generated from computing will come from the data as opposed to storage and operating systems and basic infrastructure,” he noted on this week’s podcast. Which is why one topic under discussion at next month’s Structure Data show will be who owns all the data flowing betwixt and between various systems, the internet of things, etc.

When it comes to getting corporations running these new systems, [company]Microsoft[/company] may have an ace in the hole because so many of them already use key Microsoft tools such as Active Directory, SQL Server and Excel. That gives them a pretty good on-ramp to Microsoft Azure and its resident services. Sirosh makes a compelling case and we’ll talk to him more on stage at Structure Data next month in New York City.

In the first half of the show, Derrick Harris and I talk about how the Hadoop world has returned to its feisty and oh-so-interesting roots. When Pivotal announced its plan to offload support of Hadoop to [company]Hortonworks[/company] and to work with that company, along with [company]IBM[/company] and [company]GE[/company], on the Open Data Platform, Cloudera CEO Mike Olson responded in a blog post with his own take.

Also on the docket: @WalmartLabs’ massive OpenStack production private cloud implementation.

Joseph Sirosh

SHOW NOTES

Hosts: Barb Darrow and Derrick Harris.

PREVIOUS EPISODES:

No, you don’t need a ton of data to do deep learning 

VMware wants all those cloud workloads “marooned” in AWS

Don’t like your cloud vendor? Wait a second.

Hilary Mason on taking big data from theory to reality

On the importance of building privacy into apps and Reddit AMAs

Google raises alarm over global search warrants

Google used loaded language on Wednesday to warn that a Justice Department proposal will make it easier for U.S. law enforcement to reach into remote computers, and that it could upset diplomatic relations with other countries.

“[The change] could have profound implications for the privacy rights and security interests of everyone who uses the Internet,” Richard Salgado, a Legal Director for Google, wrote in a blog post.

The issue at stake is how far a search warrant authorizing a computer search should reach. Currently, federal judges are typically restricted to issuing warrants that allow a search of computers and servers located in their district.

A proposed amendment, however, would let judges give law enforcement the right to conduct remote searches. The Justice Department said in the proposal that the current system is a strain for investigators and judges in cases where criminals use proxy IP addresses, or deploy computers across multiple jurisdictions:

“Because the target of the search has deliberately disguised the location of the media or information to be searched, the amendment allows a magistrate judge in a district in which activities related to a crime may have occurred ‘to issue a warrant to use remote access to search electronic storage media and to seize or copy electronically stored information located within or outside that district,’” said the Justice Department in a document describing the draft change (my emphasis).

Google, however, claimed that this could give U.S. law enforcement extra-territorial legal power, and justify access to computers and devices worldwide. The company argued instead that the U.S. should rely on existing diplomatic conventions that call for countries to cooperate on criminal investigations.

“The U.S. has many diplomatic arrangements in place with other countries to cooperate in investigations that cross national borders, including Mutual Legal Assistance Treaties … The significant foreign relations issues associated with the proposed change to Rule 41 should be addressed by Congress and the President, not the Advisory Committee.”

The issue has also been a sore spot for Microsoft, which is running the risk of contempt of court proceedings as a result of a hardline position it is taking in a case before the Second Circuit Court of Appeals in New York. That case is about whether a U.S. search warrant obliges the company to hand over data stored on a server in Dublin, Ireland.

The warnings from Google and Microsoft are driven in part by self-interest. The companies are anxious to reassure overseas cloud computing customers, who are skittish about the Edward Snowden revelations, that their data will not be subject to U.S. jurisdiction.

The Justice Department, meanwhile, has suggested that imposing territorial limits on computer searches is unrealistic when anyone can store data anywhere.

The Google blog post comes at the close of a comment period during which the Justice Department asked for comment on a variety of rule changes, including to “Rule 41” on search powers (the National Journal profiled Rule 41 in detail last year). A final version of the rule is still pending.

AWS maintains lead in public cloud, but Azure inches forward

Amazon Web Services continues to dominate public cloud usage across the board, but Microsoft Azure is making strides at least in business accounts, according to a new RightScale survey.

[company]Amazon[/company] cloud adoption leads the pack, with 57 percent of respondents reporting use of AWS (up from 54 percent last year), while 12 percent said they run [company]Microsoft[/company] Azure Infrastructure as a Service, up 6 percentage points from last year’s survey.

Among business or enterprise users, AWS still leads with 50 percent, up slightly from 49 percent, but Azure IaaS scored 19 percent, up from 11 percent. [company]Rackspace[/company] and [company]Google[/company] App Engine are the next most popular clouds in this category, while vCloud Air logged 7 percent adoption, down from 18 percent. (Could the rebranding of vCloud Hybrid Services to vCloud Air have been a factor here?)
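Survey write-ups often blur percentage points with percent change. As a rough illustration using only the figure pairs quoted above, the following Python sketch computes both; the function and variable names are made up for this example and are not RightScale’s.

```python
def point_change(before, after):
    """Difference in percentage points between two adoption shares."""
    return after - before


def relative_change(before, after):
    """Percent change relative to the earlier share."""
    return (after - before) / before * 100


# (last year, this year) adoption shares in percent, as quoted in the article.
figures = {
    "AWS, all respondents": (54, 57),
    "AWS, enterprise": (49, 50),
    "Azure IaaS, enterprise": (11, 19),
    "vCloud Air, enterprise": (18, 7),
}

for name, (before, after) in figures.items():
    print(
        f"{name}: {point_change(before, after):+d} points, "
        f"{relative_change(before, after):+.1f}% relative"
    )
```

The two measures can tell quite different stories: a move of a few points from a small base is a large relative jump, which is worth keeping in mind when comparing Azure’s gains with AWS’s.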

The Rackspace callout is interesting since the company said Tuesday it will stop breaking out public cloud and private cloud revenue and report them together. Rackspace is now focusing on private, managed cloud, in what some say shows it is ceding public cloud to the big guys.

RightScale Enterprise Cloud 2014-2015

All of these numbers are based on RightScale’s survey (downloadable here) of 930 cloud users, 24 percent of whom are RightScale customers.

Private cloud boosters won’t like this part: The new numbers show overall adoption of private cloud pretty much holding steady compared to last year. [company]VMware[/company] vSphere virtualized environments led with 53 percent of enterprise customers who reported that they use it as a private cloud. (Another 13 percent said they use vCloud Director as a private cloud.) This echoes last year’s survey in which many customers equated their virtualized server rooms with private cloud.

While private cloud appears to be in a bit of a swoon, it’s no surprise that Docker usage is hot. Per the survey, that containerization technology, while relatively new, is already used by 13 percent of respondents, while more than a third of the rest (35 percent) said they are planning to implement it.

Rightscale Public Clouds 2014

OpenStack showed the greatest traction this year, with 13 percent adoption, growing by three percent year over year and still garnering big interest from companies whether they use it or not. A full 30 percent of respondents said they were evaluating or interested in using OpenStack over time. Microsoft’s relatively new Azure Pack showed a respectable seven percent usage. Azure Pack, which mirrors Microsoft’s internal Azure usage, can run in a company’s own data centers or server rooms to provide an Azure-on-Azure hybrid.

Overall, Santa Barbara, California–based RightScale concluded from its research that cloud adoption is “a given” and hybrid cloud is the preferred mode of adoption. Of course RightScale offers multi-cloud management tools so that works out nicely for them.

RightScale VP of Marketing Kim Weins was our Structure Show guest after last year’s survey and had some interesting insights that might be helpful to compare and contrast. Check out the podcast below.

[soundcloud url=”https://api.soundcloud.com/tracks/143987938?secret_token=s-6kZD6″ params=”color=ff5500&auto_play=false&hide_related=false&show_artwork=true” width=”100%” height=”166″ iframe=”true” /]

Microsoft Office can now save files to Apple’s iCloud Drive

Microsoft is really serious about enabling its services to run on all devices and work with other companies’ products.

The latest example: The Office app for iPhones and iPads can now save files to Apple’s relatively new cloud storage service, iCloud Drive, along with other cloud services providers, including Box, Google Drive, and any other service that decides to enable integration with Microsoft Office.

In the updated Microsoft Office app, the locations menu will let you open, edit, and save documents stored with the service of your choice. It’s not perfect: for instance, certain text files stored in iCloud will be read-only. Previously, the file picker in the Word, Excel, and PowerPoint apps could only display files stored in Microsoft OneDrive and, following an update last November, Dropbox.

New-Cloud-storage-integration-for-Office-1

Microsoft also announced that Office can now be integrated into other companies’ enterprise applications, such as Box, Salesforce, and Citrix.

Microsoft still has a tricky tightrope to walk between recommending and selling its own services and giving consumers and business users the flexibility to use the tools they prefer. It also puts companies like Box into an odd arrangement where they are both competing with Microsoft’s products and collaborating with the Redmond giant. Now that a lot of companies have their own sync-and-storage product, Box and Dropbox want to turn into platform companies, which runs counter to Microsoft’s aims. But for the time being, the Office app can open Box files, and later this year, users will be able to open Office Online in the Box app.

It’s also a bit of a defensive move to claim the interoperability mantle for mobile productivity — last week Apple opened the beta of its free web-based iWork suite to Windows users, but Apple’s probably not adding OneDrive integration. Google’s web-based work suite only works with files saved on Google Drive.

The update adding support for iCloud and other cloud service providers is available from the iTunes App Store today. Microsoft says it’s “hard at work” adding the same features to the Office apps for Windows and Android.

Microsoft claims compliance with ISO data privacy standard

Microsoft says its compliance with a data privacy standard set by the International Organization for Standardization (ISO) means customer data in its Azure cloud will be safer from prying eyes.

The ISO/IEC 27018 standard aims to establish “a uniform, international approach to protecting privacy for personal data stored in the cloud,” Microsoft General Counsel and EVP Brad Smith wrote in a blog post.

A third party, the British Standards Institution (BSI), has verified that Microsoft Azure as well as Office 365 and Dynamics CRM Online meet the ISO criteria, Smith noted.

Compliance means that the vendor’s customer controls her data and will know what’s happening with that data down the line. It also requires the vendor to implement strong security and restricts how data can be handled on public networks, transportable media etc. And, it means that data will not be used for advertising — which means that [company]Google[/company] is unlikely to climb aboard this particular bandwagon.

This is not an academic exercise for [company]Microsoft[/company], which is fighting a U.S. court order to turn over customer data residing in its Dublin data center to U.S. authorities.

Cloud competitors are likely to call this a PR stunt (a concept that Microsoft is familiar with), but a security expert said ISO/IEC 27018 certification could become a major selling point to privacy-obsessed consumers who balk at the notion that Google, because of its advertising business, uses customer data to sell stuff.

Said this expert, who requested anonymity because he works with both Google and Microsoft: “Google would never agree to this since advertising is everything to them … Personally when I pay someone for a service, I expect my data to be private. When I use a service for free I accept that it is being paid for by sacrificing my privacy.”

For more on Microsoft’s data privacy stance, see Smith’s talk at last year’s Structure show below.

[youtube https://www.youtube.com/watch?v=6ncpPRqAJpc]