Two terms kept popping up as I watched a slew of Microsoft executives show off the company’s future at its annual TechForum media gathering last week. One was “machine learning.” The other was “Bing.”
I would have been surprised had I not sat down with Microsoft Technical Fellow Dave Campbell the night before the event to talk big data. After all, I was in Redmond — home of Word, Excel and a, shall we say, misunderstood new operating system — not Silicon Valley, where “machine learning” now rolls off the tongue as easily and often as “startup” or “triathlon.”
However, a single rhetorical question from Campbell resonated pretty loudly and got me in the right frame of mind for what I was about to hear: Who else, he asked, has a top-tier web service business (complete with the hundreds of petabytes of data those services collect) as well as a top-tier enterprise software business?
He could have added to that list a consumer software business, 30 percent of the world’s long-distance calls, a mobile device business, one of the world’s most popular gaming platforms, a large-screen touch-display business, and a motion-sensing device that ties into — and can control — all of them. They all came into play at TechForum, as various company presidents, engineers and now-adviser-to-the-CEO Craig Mundie demonstrated a future where everything is connected and trying to learn what we like and what we’re doing.
Bing is the key to it all (even if it can’t touch Google)
Microsoft’s Bing search engine is at the core of everything the company is trying to do in the field of machine learning and cutting-edge big data. That fact makes it an important part of Microsoft’s future even if it never gets close to Google search in terms of revenue or users. “Its long-term value is just as much as a deep infrastructural element,” Mundie said during a Q&A session kicking off the event.
What he means is that Bing is valuable because the technology developed to power it ultimately stands to make Microsoft a lot more money in other areas. Qi Lu, Microsoft’s Online Services Division president (and an integral part of the maturation of Hadoop inside Yahoo earlier this century), describes Bing’s primary architecture as less of a traditional keyword index and more of an “information fabric.” We’re building a digital society, he explained, so there are digital entities (people, places and things), and Bing must be able to capture the rich spatial, temporal and other relationships among them.
Taking that vision company-wide, Microsoft can take in data from Bing, Skype, Xbox Live, Office 365 and other sources and actually store, process and analyze it in meaningful ways. Internally, this might be for business-intelligence or product-development purposes. Externally, Microsoft might use data to create experiences that span devices and services.
Bing also feeds the pipeline for future enterprise IT products, particularly when it comes to data management. Campbell tells the story of meeting a colleague years after he left the SQL database team and went to work on Bing’s infrastructure. At that point, their worlds were vastly different, but the advent of and hype around big data have brought them together once again.
During his presentation, Satya Nadella, Microsoft’s Server and Tools Business president, said the company now builds internal IT with a design-for-first-party-but-think-of-third-party mentality. As a result, the core of the Windows Azure cloud-computing platform is based on technologies developed to run Bing, as is the Windows Azure storage service. When Microsoft builds a new operating system, he added, it thinks about the project at web scale, in terms of what it would take to run Bing on that platform.
And Campbell told me via email after the event that Microsoft is considering how to productize the various graph, NoSQL and other types of databases it uses to power the features within Bing. Ironically, though, its Cosmos and Dryad technologies that serve as the core of Bing are off the table: consumers demanded Hadoop, so that’s what Microsoft is currently pushing for mass storage and large-scale batch processing.
Google, of course, is doing something very similar, albeit with less of a focus on enterprise software as a final destination for its technologies (with the exception of its small suite of cloud services such as Compute Engine, App Engine and BigQuery). Rather, the types of advances in data storage, processing and analysis that Google has made thanks to products such as search and YouTube are finding their way into Project Glass and self-driving cars. Time will tell whose efforts prove wiser in the end.
A little history and prognostication on machine learning
Mundie said machine learning, especially, has been a core part of Microsoft Research’s focus for years. And although there were some initial struggles, including a dearth of good data and machines powerful enough to process it all, the company and the industry as a whole have come a long way. Among the big areas of improvement he cited were real-time speech recognition — Microsoft has done some impressive work in this area, actually — and natural user interaction.
“We’ve talked for a long time in the industry about IT meaning information technology,” Mundie said, “… you might redefine IT to be intelligent technology.”
Eric Rudder, Mundie’s protégé and chief technical strategy officer, elaborated. If you think about all the pictures and other info Microsoft’s devices and services capture, he said, you’ll see a lot of opportunity to learn and build better products. Stepping out of the consumer world, he questioned how one might begin working with a 40-billion-row Excel spreadsheet. Query it, talk to it or somehow use gestures to communicate with it?
Mundie thinks Microsoft can answer these and other questions, despite a relative lack of attention compared with Google’s research efforts and a consumer community he says is “jaded” by the omnipresence of high technology. TV makers are copying Kinect, speech will be the most prevalent mode of user interaction and cameras as inputs are coming soon, he said. And Microsoft’s machine-learning research will let it capitalize on or even lead these movements, he added.
As I’ll highlight in a follow-up post, Microsoft showed off a lot of these capabilities to the handful of journalists invited to TechForum. Kinect, Office, Xbox Live — they’re all watching, listening, learning and working together.
It’s part of a greater transition away from “specialized gadgets” that process information and into a world full of generally intelligent devices and services that just let people get stuff done. “The vast majority of humankind,” Mundie said, “doesn’t really care about the computer, per se.”
Have research division, will persevere
In the end, Microsoft Chief Research Officer Rick Rashid expects that Microsoft’s heavy investment in general research of the kind his team does will help it get the last laugh over some of its competitors. He wonders whether companies like Apple, which already saved itself once, will be ready to ride the next wave of innovation or the one after that without dedicated general research departments that aren’t necessarily tied to product development. His view is that you can only buy yourself into the next generation so many times.
It was Microsoft Research, for example, that developed a method for compressing 32-bit code in the early 1990s, something that would prove fortuitous when it came time to ship Windows 95 and its associated applications despite the fact that most PCs lacked the proper hardware for the 32-bit OS. As for establishing the dominance of Office over peers that had to wait until the hardware caught up, Rashid told a group of reporters during the event, “that was game over.”
“Our industry is littered with companies that aren’t here anymore,” he added.
Touché. Microsoft is the butt of a lot of jokes, but as the tech world shifts toward intelligent devices and alternative modes of human-computer interaction, the company’s research into areas such as big data and machine learning suggests it will still be very much around for some time to come.