Here are the winners of the 2015 Structure Data Awards

The second annual Structure Data Awards are here, where Gigaom picks the most interesting and most promising data startups that launched in the previous year. The winners, which range from a non-profit data science organization to a company building infrastructure for deep learning, will present during a special session at our Structure Data conference, which takes place March 18 and 19 in New York.

This year’s winners are:

Bayes Impact
A non-profit organization that emerged from Y Combinator, Bayes Impact is trying to bring data to bear on some of society’s thorniest problems. It hosts fellows, works directly with other non-profit organizations, and puts on hackathons to identify new applications for data science.

Confluent
Apache Kafka has become a popular tool for managing real-time streams of data from websites, applications and sensors. In March, the team that created Kafka while at LinkedIn launched Confluent to help commercialize the technology.
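
Kafka’s core abstraction is an append-only log that many consumers read at their own pace, each tracking its own position. That idea can be sketched in a few lines of plain Python (a toy illustration of the concept, not Confluent’s or Kafka’s actual API):

```python
# Toy append-only log in the spirit of Kafka: producers append records,
# and each consumer keeps its own offset, so multiple consumers can read
# the same stream independently and at different speeds.

class Log:
    def __init__(self):
        self.records = []              # the append-only log

    def append(self, record):
        self.records.append(record)

    def read(self, offset):
        return self.records[offset:]   # everything at or after offset

class Consumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0                # this consumer's position in the log

    def poll(self):
        batch = self.log.read(self.offset)
        self.offset += len(batch)
        return batch

log = Log()
log.append({"page": "/home", "user": 1})
log.append({"page": "/about", "user": 2})

fast, slow = Consumer(log), Consumer(log)
print(fast.poll())  # the fast consumer reads both records
log.append({"page": "/home", "user": 3})
print(fast.poll())  # ...and later picks up only the new one
print(slow.poll())  # the slow consumer still sees the full stream
```

Real Kafka layers partitioning, replication and durable storage on top of this pattern, which is what makes it usable at LinkedIn-like scale.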

Enlitic
Deep learning has proven its prowess in pattern recognition and computer vision, although advances often emerge from the corporate labs of large web companies. Enlitic is applying those techniques to health care, trying to build deep learning models that can diagnose disease from medical images.

Interana
Based on the data-centric culture two of its founders experienced while building data products at Facebook, Interana’s software is designed to open data analysis to entire companies. Aside from the user experience, the team built an entire data storage and low-latency processing stack from the ground up.

David Soloff of Premise Data (left) and his Structure Data award in 2014.

MetaMind
MetaMind is the product of years of artificial intelligence research by its founding team, including in the field of deep learning. The company’s goal is to help other organizations make the most of their text and image data, and to push the state of the art with its own research.

Nervana Systems
As deep learning took off, the team at Nervana sensed an opportunity to build systems specially designed for the unique computing requirements of neural networks. Although the company includes folks who have worked on neurosynaptic chips at places such as Qualcomm, Nervana is building a hardware-and-software platform.

Tamr
Big data is taking off within enterprises, but finding and transforming relevant datasets is still very difficult. Tamr, which was founded by former Vertica CEO Andy Palmer and database expert Michael Stonebraker, tries to simplify that work with a combination of machine learning and human data stewards.

TeraDeep
TeraDeep sits at the intersection of two very big trends — deep learning and the internet of things. The company has developed deep learning algorithms that can run on smartphone processors and FPGAs, and is building small processors that can be embedded into devices to make them intelligent.

Of course, startups are just a part of Structure Data 2015. The event also features executives from the biggest, best and most innovative companies around — including BuzzFeed, ESPN, Google and NASA — and researchers from universities and companies including Facebook, MIT, NYU and Stanford.

Facebook open sources tools for bigger, faster deep learning models

Facebook on Friday open sourced a handful of software libraries that it claims will help users build bigger, faster deep learning models than existing tools allow.

The libraries, which [company]Facebook[/company] is calling modules, are alternatives for the default ones in a popular machine learning development environment called Torch, and are optimized to run on [company]Nvidia[/company] graphics processing units. Among them are modules designed to dramatically speed up training for large computer vision systems (nearly 24 times faster, in some cases); to train systems on potentially millions of different classes (e.g., predicting whether a word will appear across a large number of documents, or whether a picture was taken in any city anywhere); and to provide an optimized method for building language models and word embeddings (e.g., knowing how different words are related to each other).
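
Word embeddings, the last of those examples, map each word to a vector so that related words end up close together. A toy sketch of the idea in Python, with made-up vectors (real embeddings are learned from data; this is not Facebook’s module):

```python
# Toy word-embedding lookup: related words get similar vectors, so
# cosine similarity between vectors reflects how related the words are.
# The vectors below are invented for illustration, not learned.
import math

embeddings = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.90],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(embeddings["king"], embeddings["queen"]))  # close to 1.0
print(cosine(embeddings["king"], embeddings["apple"]))  # much lower
```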

“[T]here is no way you can use anything existing” to achieve some of these results, said Soumith Chintala, an engineer with Facebook Artificial Intelligence Research.

That team was formed in December 2013 when Facebook hired prominent New York University researcher Yann LeCun to run it. Rob Fergus, one of LeCun’s NYU colleagues who also joined Facebook at the same time, will be speaking on March 19 at our Structure Data conference in New York.

A heatmap comparing the performance of Facebook's modules to standard ones on datasets of various sizes. The darker the green, the faster Facebook was.

Despite the sometimes significant improvements in speed and scale, however, the new Facebook modules probably are “not going to be super impactful in terms of today’s use cases,” Chintala said. While they might produce noticeable improvements within most companies’ or research teams’ deep learning environments, he explained, they’ll really make a difference (and justify making the switch) when more folks are working on stuff at a scale like Facebook is now — “using models that people [previously] thought were not possible.”

Perhaps the bigger and more important picture now, then, is that Friday’s open source releases represent the start of a broader Facebook effort to open up its deep learning research the way it has opened up its work on webscale software and data centers. “We are actually going to start building things in the open,” Chintala said, releasing a steady stream of code instead of just the occasional big breakthrough.

Facebook is also working fairly closely with Nvidia to rework some of its deep learning programming libraries to work at web scale, he added. Although it’s working at a scale beyond many mainstream deep learning efforts and its researchers change directions faster than would be feasible for a commercial vendor, Facebook’s advances could find their way into future releases of Nvidia’s libraries.

Given the excitement around deep learning right now — for everything from photo albums to self-driving cars — it’s a big deal that more and better open source code is becoming available. Facebook joins projects such as Torch (which it uses), Caffe and the Deeplearning4j framework being pushed by startup Skymind. Google has also been active in releasing certain tooling and datasets ideal for training models.

It was open source software that helped make general big data platforms, using software such as Hadoop and Kafka, a reality outside of cutting-edge web companies. Open source might help the same thing happen with deep learning, too — scaling it beyond the advances of leading labs at Facebook, Google, Baidu and Microsoft.

Baidu built a supercomputer for deep learning

Chinese search engine company Baidu says it has built the world’s most-accurate computer vision system, dubbed Deep Image, which runs on a supercomputer optimized for deep learning algorithms. Baidu claims a 5.98 percent error rate on the ImageNet object classification benchmark; a team from Google won the 2014 ImageNet competition with a 6.66 percent error rate.

In experiments, humans achieved an estimated error rate of 5.1 percent on the ImageNet dataset.
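
Those figures are top-5 error rates: a prediction counts as correct if the true label appears among the model’s five highest-ranked guesses. The metric itself is simple to compute (toy data below, not the actual ImageNet benchmark):

```python
# Top-5 error: a prediction is wrong only when the true label is missing
# from the model's five highest-ranked guesses. Toy data for illustration.

def top_k_error(ranked_predictions, true_labels, k=5):
    wrong = sum(1 for preds, truth in zip(ranked_predictions, true_labels)
                if truth not in preds[:k])
    return wrong / len(true_labels)

# Each inner list is one model output, best guess first.
preds = [
    ["cat", "dog", "fox", "wolf", "lynx", "bear"],
    ["car", "truck", "bus", "van", "train", "bike"],
    ["rose", "tulip", "lily", "iris", "daisy", "poppy"],
    ["hawk", "eagle", "owl", "crow", "gull", "duck"],
]
truth = ["cat", "bike", "oak", "owl"]  # "bike" ranked 6th, "oak" absent

print(top_k_error(preds, truth, k=5))  # 2 misses out of 4 -> 0.5
```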

The star of Deep Image is almost certainly the supercomputer, called Minwa, which Baidu built to house the system. Deep learning researchers have long (well, for the past few years) used GPUs in order to handle the computational intensity of training their models. In fact, the Deep Image research paper cites a study showing that 12 GPUs in a 3-machine cluster can rival the performance of the 1,000-node CPU cluster behind the famous Google Brain project, on which Baidu Chief Scientist Andrew Ng worked.


But no one has yet built a system like this dedicated to the task of computer vision using deep learning. Here’s how paper author Ren Wu, a distinguished scientist at the Baidu Institute of Deep Learning, describes its specifications:

[blockquote person=”” attribution=””]It is comprised of 36 server nodes, each with 2 six-core Intel Xeon E5-2620 processors. Each server contains 4 Nvidia Tesla K40m GPUs and one FDR InfiniBand (56Gb/s) which is a high-performance low-latency interconnection and supports RDMA. The peak single precision floating point performance of each GPU is 4.29TFlops and each GPU has 12GB of memory.

… In total, Minwa has 6.9TB host memory, 1.7TB device memory, and about 0.6 [petaflops] theoretical single precision peak performance.[/blockquote]
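
Those totals are consistent with the per-node figures, as a quick back-of-the-envelope check shows:

```python
# Back-of-the-envelope check of the Minwa totals quoted above.
nodes = 36
gpus_per_node = 4
tflops_per_gpu = 4.29   # peak single-precision per Tesla K40m
gpu_mem_gb = 12

total_gpus = nodes * gpus_per_node                  # 144 GPUs
peak_pflops = total_gpus * tflops_per_gpu / 1000.0  # ~0.62 petaflops
device_mem_tb = total_gpus * gpu_mem_gb / 1000.0    # ~1.7 TB device memory

print(total_gpus, round(peak_pflops, 2), round(device_mem_tb, 2))
```

Which matches the paper’s “about 0.6” petaflops of theoretical peak and 1.7TB of device memory.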

Sheer performance aside, Baidu built Minwa to help overcome problems associated with the types of algorithms on which Deep Image was trained. “Given the properties of stochastic gradient descent algorithms, it is desired to have very high bandwidth and ultra low latency interconnects to minimize the communication costs, which is needed for the distributed version of the algorithm,” the authors wrote.

A sample of the effects Baidu used to augment images.

Having such a powerful system also allowed the researchers to work with different, and arguably better, training data than most other deep learning projects. Rather than using the 256 x 256-pixel images commonly used, Baidu used higher-resolution images (512 x 512 pixels) and augmented them with various effects such as color-casting, vignetting and lens distortion. The goal was to let the system take in more features of smaller objects and to learn what objects look like without being thrown off by editing choices, lighting situations or other extraneous factors.
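
Color-casting, for instance, shifts one color channel across the whole image so the model learns the object rather than the lighting. A toy version on nested RGB tuples (an illustration of the idea, not Baidu’s actual pipeline):

```python
# Toy color-cast augmentation: add a fixed offset to one RGB channel
# across the whole image, clamping to the valid 0-255 range. Training on
# such variants teaches a model to ignore lighting and editing choices.

def color_cast(image, channel, offset):
    def shift(pixel):
        p = list(pixel)
        p[channel] = max(0, min(255, p[channel] + offset))
        return tuple(p)
    return [[shift(px) for px in row] for row in image]

image = [[(100, 150, 200), (10, 20, 30)],
         [(250, 240, 230), (0, 0, 0)]]

red_cast = color_cast(image, channel=0, offset=40)
print(red_cast[0][0])  # (140, 150, 200)
print(red_cast[1][0])  # (255, 240, 230) -- clamped at 255
```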

Baidu is investing heavily in deep learning, and Deep Image follows up a speech-recognition system called Deep Speech that the company made public in December. As executives there have noted before, including Ng at our recent Future of AI event in September, the company already sees a relatively high percentage of voice and image searches and expects that number to increase. The better its products can perform with real-world data (research datasets tend to be fairly optimal), the better the user experience will be.


However, Baidu is far from the only company — especially on the web — investing significant resources in deep learning and getting impressive results. Google, which still holds the ImageNet record in the actual competition, is probably the company most associated with deep learning and this week unveiled new Google Translate features that likely utilize the technology. Microsoft and Facebook also have very well-respected deep learning researchers and continue to do cutting-edge research in the space while releasing products that use that research.

Yahoo, Twitter, Dropbox and other companies also have deep learning and computer vision teams in place.

Our Structure Data conference, which takes place in March, will include deep learning and machine learning experts from many organizations, including Facebook, Yahoo, NASA, IBM, Enlitic and Spotify.


Artificial intelligence is real now and it’s just getting started

Artificial intelligence is already very real. Not conscious machines, omnipotent machines or even reasoning machines (yet), but statistical machines that automate and increasingly can outperform humans at certain pattern-recognition tasks. Computer vision, language understanding, anomaly detection and other fields have made immense advances in the past few years.

All this work will be the stepping stones for future AI systems that, decades from now, might perform feats we’ve only imagined computers could perform. There are brain-inspired neurosynaptic microchips under development, and quantum artificial intelligence might only be a decade away. Some experts predict general artificial intelligence — perhaps even artificial superintelligence — will happen easily within this century.

The effects that AI is having and will have on business, the economy and, most importantly, humanity certainly merit consideration. It’s a discussion we’ll have at our Structure Data conference, which takes place March 18 and 19 in New York, and includes some of the top AI and machine learning researchers, practitioners and thinkers. Confirmed speakers so far include:

  • Ron Brachman, Yahoo
  • Rob Fergus, Facebook Artificial Intelligence Research
  • Ahna Girshick, Enlitic
  • Jeff Hawkins, Numenta
  • Steven Horng, Beth Israel Deaconess Medical Center, Harvard Medical School
  • Anthony Lewis, Qualcomm
  • Gary Marcus, New York University
  • Dharmendra Modha, IBM Research
  • Naveen Rao, Nervana Systems
  • Ashutosh Saxena, Stanford University
  • Julie Shah, Massachusetts Institute of Technology
  • Davide Venturelli, NASA Quantum Artificial Intelligence Laboratory
  • Brian Whitman, Spotify
  • Dan Zigmond, Hampton Creek

We plan to confirm additional speakers in the next couple weeks. We’ll delve even further into the topic of AI in all its facets at the inaugural Structure Intelligence conference. That’s currently scheduled to take place in San Francisco in September, and more details will be available soon.

Hortonworks CEO Rob Bearden (who’ll be back this year) at Structure Data 2014.

Of course, Structure Data is about more than just AI — it’s about managing, analyzing and making use of all sorts of data for all sorts of applications. In addition to the lineup of speakers we announced in December (from companies including BuzzFeed, Twitter, Goldman Sachs, Tableau and all three major Hadoop vendors, to name several), other newly confirmed speakers include:

  • Eric Brewer, vice president of infrastructure, Google
  • Julie Brill, commissioner, Federal Trade Commission
  • Krish Dasgupta, vice president of data platforms and technology, ESPN
  • Victor Nilson, vice president of big data, AT&T
  • Bill Ruh, vice president of global software center, GE Global Research
  • Bill Squadron, executive vice president of Pro Analytics, STATS

The data revolution is at an inflection point and looks set to shoot upward in the next few years. Come hear some of the smartest people in the space discuss what it will mean across industries and societies.

Machine learning will eventually solve your JPEG problem

I take a lot of photos on my smartphone. So many, in fact, that my wife calls me Cellphone Ansel Adams. I can’t imagine how many more digital photos we’d have cluttering up our hard drives and cloud drives if I ever learned how to really use the DSLR.

So I get excited when I read and write about all the advances in computer vision, whether they’re the result of deep learning or some other technique, and all the photo-related acquisitions in that space (Google, Yahoo, Pinterest, Dropbox and Twitter have all bought computer vision startups). I’m well aware there are much wider-ranging and important implications, from better image-search online to disease detection — and we’ll discuss them all at our Structure Data conference in March — but I personally love being able to search through my photos by keyword even though I haven’t tagged them (we’ll probably discuss that at Structure Data, too).

A sample of the results when I search my Google+ photos for "lake."

I love that Google+ can detect a good photo, or series of photos, and then spice it up with some Auto-Awesome.

Depending on the service you use to manage photos, there has never been a better time to take too many of them.

If there’s one area that has lagged, though, it’s the creation of curated photo albums. Sometimes Google makes them for me and, although I like it in theory (especially for sharing an experience in a neatly packaged way), they’re usually not that good. It will be an album titled “Trip to New York and Jersey City,” for example, and will indeed include a handful of photos I took in New York, just usually not the ones I would have selected.

Although I’m not about to go through my thousands of photos (or even dozens of photos the day after a trip) and create albums, I’ll gladly let a service do it for me. But I’ll only do more than glance at the albums if they’re good. Usually, I love getting the alert that an album is ready, and then get over the excitement really quickly.

So I was interested to read a new study by Disney Research discussing how its researchers have developed an algorithm that creates photo albums based on more factors than just time and geography, or even whether photos are “good.” The full paper goes into a lot more detail about how they trained the system (sorry, no deep learning), but this description from a press release about it sums up the results nicely:

[blockquote person=”” attribution=””]To create a computerized system capable of creating a compelling visual story, the researchers built a model that could create albums based on a variety of photo features, including the presence or absence of faces and their spatial layout; overall scene textures and colors; and the esthetic quality of each image.

Their model also incorporated learned rules for how albums are assembled, such as preferences for certain types of photos to be placed at the beginning, in the middle and at the end of albums. An album about a Disney World visit, for instance, might begin with a family photo in front of Cinderella’s castle or with Mickey Mouse. Photos in the middle might pair a wide shot with a close-up, or vice versa. Exclusionary rules, such as avoiding the use of the same type of photo more than once, were also learned and incorporated.[/blockquote]
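
A crude sketch of that idea: score candidate photos by how well their type suits each album position, and never reuse a type. The weights below are invented for illustration; Disney’s system learns such preferences and exclusion rules from real albums:

```python
# Crude sketch of the album-assembly idea described above: prefer certain
# photo types at certain positions and never repeat a type. The weights
# are invented; the real system learns such preferences from data.

POSITION_PREFS = {
    "start":  {"group-portrait": 3, "wide-shot": 2, "close-up": 1},
    "middle": {"wide-shot": 3, "close-up": 3, "group-portrait": 1},
    "end":    {"group-portrait": 3, "wide-shot": 1, "close-up": 1},
}

def assemble_album(photos, slots=("start", "middle", "end")):
    album, used_types = [], set()
    for slot in slots:
        prefs = POSITION_PREFS[slot]
        candidates = [p for p in photos
                      if p not in album and p["type"] not in used_types]
        best = max(candidates,
                   key=lambda p: prefs.get(p["type"], 0) + p["quality"])
        album.append(best)
        used_types.add(best["type"])  # exclusionary rule: no repeated types
    return album

photos = [
    {"name": "castle", "type": "group-portrait", "quality": 2},
    {"name": "parade", "type": "wide-shot", "quality": 1},
    {"name": "mickey", "type": "close-up", "quality": 2},
    {"name": "castle2", "type": "group-portrait", "quality": 1},
]

print([p["name"] for p in assemble_album(photos)])
```

Even this toy version opens with the castle group portrait and never repeats a photo type, echoing the Disney World example above.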


It’s just research and surely isn’t perfect, but it feels like a step in the right direction. It could make sharing photos so much easier and more enjoyable for everyone involved. There’s no doubt the folks at Google, Yahoo and elsewhere are already working on similar things so they can roll them out across services such as Flickr and Google+.

Remember physical slide shows with projectors? The same rules still apply: Your aunt and your friends don’t want to skip through 5 pictures of your finger over the lens, marvel at the beauty of the same rock formation shot from 23 slightly different angles, or laugh at that sign that you had to be there to get why it’s funny. They want a handful of pictures of you looking nice in front of famous landmarks or pretty sunsets. Probably on their phone while waiting in line at the checkout.

I don’t always have the self-control or editorial sense to deliver that experience. I’ll be happy if an algorithm can do it for me.

The 5 stories that defined the big data market in 2014

There is no other way to put it: 2014 was a huge year for the big data market. It seems years of talk about what’s possible are finally giving way to some real action on the technology front — and there’s a wave of cash following close behind it.

Here are the five stories from the past year that were meaningful in their own rights, but really set the stage for bigger things to come. We’ll discuss many of these topics in depth at our Structure Data conference in March, but until then feel free to let me know in the comments what I missed, where I went wrong or why I’m right.

5. Satya Nadella takes the reins at Microsoft

Microsoft CEO Satya Nadella has long understood the importance of data to the company’s long-term survival, and his ascendance to the top spot ensures Microsoft won’t lose sight of that. Since Nadella was appointed CEO in February, we’ve already seen Microsoft embrace the internet of things, and roll out new data-centric products such as Cortana, Skype Translate and Azure Machine Learning. Microsoft has been a major player in nearly every facet of IT for decades and how it executes in today’s data-driven world might dictate how long it remains in the game.

Microsoft CEO Satya Nadella speaks at a Microsoft Cloud event. Photo by Jonathan Vanian/Gigaom

4. Apache Spark goes legit

It was inevitable that the Spark data-processing framework would become a top-level project within the Apache Software Foundation, but the formal designation felt like an official passing-of-the-torch nonetheless. Spark promises to do for the Hadoop ecosystem all the things MapReduce never could around speed and usability, so it’s no wonder Hadoop vendors, open source projects and even some forward-thinking startups are all betting big on the technology. Databricks, the first startup trying to commercialize Spark, has benefited from this momentum, as well.
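
MapReduce’s shape is easiest to see in its canonical example, word count: a map phase emits (word, 1) pairs, a shuffle groups them by key, and a reduce phase sums each group. A single-process Python sketch follows (real MapReduce and Spark distribute these steps across a cluster; much of Spark’s speed advantage comes from keeping the intermediate data in memory between stages instead of writing it to disk):

```python
# Single-process sketch of MapReduce word count: map emits (word, 1)
# pairs, shuffle groups them by key, reduce sums each group.
from collections import defaultdict

def map_phase(documents):
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["Spark promises speed", "speed and usability", "Spark and Hadoop"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["spark"], counts["speed"], counts["and"])  # 2 2 2
```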

Spark co-creator and Databricks CEO Ion Stoica.

3. IBM bets its future on Watson

Big Blue might have abandoned its server and microprocessor businesses, but IBM is doubling down on cognitive computing and expects its new Watson division to grow into a $10 billion business. The company hasn’t wasted any time trying to get the technology into users’ hands — it has since announced numerous research and commercial collaborations, highlighted applications built atop Watson and even worked Watson tech into the IBM cloud platform and a user-friendly analytics service. IBM’s experiences with Watson won’t only affect its bottom line; they could be a strong indicator of how enterprises will ultimately use artificial intelligence software.

A shot of IBM’s new Watson division headquarters in Manhattan.

2. Google buys DeepMind

It’s hard to find a more exciting technology field than artificial intelligence right now, and deep learning is the force behind a lot of that excitement. Although there were myriad acquisitions, startup launches and research breakthroughs in 2014, it was Google’s acquisition of London-based startup DeepMind in January that set the tone for the year. The price tag, rumored to be anywhere from $450 million to $628 million, got the mainstream technology media paying attention, and it also let deep learning believers (including those at competing companies) know just how important the technology is to Google.

Google’s Jeff Dean talks about early deep learning results at Structure 2013.

1. Hortonworks goes public

Cloudera’s massive (and somewhat convoluted) deal with Intel boosted the company’s valuation past $4 billion and sent industry-watchers atwitter, but the Hortonworks IPO in December was really a game-changer. It came faster than most people expected, was more successful than many people expected, and should put the pressure on rivals Cloudera and MapR to act in 2015. With a billion-plus-dollar market cap and public market trust, Hortonworks can afford to scale its business and technology — and maybe even steal some valuable mindshare — as the three companies vie to own what could be a humongous software market in a few years’ time.


Hortonworks rings the opening bell on its IPO day.

Honorable mentions

The year in tech: Net neutrality, IoT grows up, Uber turns heads

As 2014 draws to a close, the tech world seems a little weary. It was a draining year if you were plugged into social media, with conflicts at home and overseas juxtaposed against the soaring wealth of the San Francisco Bay Area, home to an industry that has become one of the dominant forces in the world. As we inch closer to what will likely be the top of the Third Tech Boom-Bust Cycle since the web changed the world, technology has never been more present in our day-to-day lives, for better or worse.

But for all the conflict that marked the year in tech — a blatant power grab by the company that was actually voted “Worst Company in America,” the uneasiness that FCC Chairman Tom Wheeler might finally reward his old buddies in the cable industry with favorable internet regulation, and a series of public-relations disasters by Uber that left a black mark on the next dominant tech company — there were plenty of bright spots, especially among the areas that Gigaom follows closely.

Big Data has turned into big money and the rise of deep learning and artificial intelligence could transform computing. The cloud is the norm, and the largest companies in tech are going all-in on cloud computing as new startups promise to make complex app development even simpler. The internet of things, a concept we have evangelized for years, went from buzzword-just-around-the-corner to the cornerstone of planning from tech companies big and small heading into 2015.

The king of the hill — Apple — unveiled what could be its next-generation product category breakthrough amid the growing popularity of wearable computers. Microsoft showed that it is at last ready to enter the mobile computing era with the refreshing emergence of Satya Nadella as its third-ever CEO. And Tesla proved that the electric car is alive and well, and just getting started as the vehicle of the 21st century.

I asked our writers to pick the most important, most notable, and most influential developments on their beats in 2014, and here’s what they came up with. We’re looking forward to the holiday break as much as the rest of you are, because 2015 promises to be a landmark year for the tech industry.

Thanks for reading Gigaom, and Happy Holidays.
— Tom Krazit, Executive Editor


Net neutrality twists and turns

Demonstrators protest outside the FCC as the commission is about to meet to receive public comment on proposed open Internet notice of proposed rulemaking and spectrum auctions May 15, 2014 at the FCC headquarters in Washington, DC.

Two events this summer suggested that things were about to get much, much worse for American internet users: the FCC signaled strongly that it would give the go-ahead to ISPs to create “fast lanes” for favored websites, and the business press predicted that telecom giant [company]Comcast[/company] faced smooth sailing in its quest to swallow its largest rival, [company]Time Warner Cable[/company].

Then something changed. A popular backlash, egged on by the likes of comedian John Oliver and fanned by four million internet comments, caused the political winds to shift. All of a sudden, President Barack Obama called to implement real “Title II” net neutrality, and the FCC abruptly cooled on both the fast lanes and the Comcast merger. Going into 2015, consumers face an unexpectedly positive outlook for faster internet and real broadband competition.
— Jeff John Roberts, Senior Writer


Surveillance fight ramps up

When it comes to online surveillance, the most significant developments of the year were probably the striking-down by Europe’s top court of the E.U. Data Retention Directive, and the aftermath of that decision. The court said the directive, which forced telecom providers to store metadata about their users’ communications for surveillance purposes, was incompatible with the rights to privacy and the protection of personal data. The U.K. responded with a barely debated “emergency law” (months after the court’s decision) that not only made it possible for the British surveillance regime to continue, but expanded it to take in communication over social networks and more. Sweden, too, doubled down on data retention, leading one ISP there to offer free VPN access to its customers. Australia is now also introducing data retention. Meanwhile, the United Nations has begun condemning the practice on human rights grounds.
— David Meyer, Senior Writer


The resurgence of T-Mobile

The Paramount Theater in Seattle played host to T-Mobile's Uncarrier 5.0 event in June.

If there were a telecom horoscope, 2014 would be the year of [company]T-Mobile[/company]. The carrier was the object of other carriers’ desires — with Sprint and Softbank as well as French ISP Iliad angling to buy the company — and it became a symbol for competition in the U.S., with regulators making it clear that they don’t want to see fewer than four nationwide operators.

T-Mobile, however, didn’t just sit idly by while the industry fought over its future. It became a competitive force in its own right, as well as a thorn in the sides of [company]AT&T[/company] and [company]Verizon[/company]. It launched several unique initiatives, such as an iPhone loaner program. T-Mo even killed — or at least maimed — one of the mobile industry’s sacred cows, announcing that in January 2015 it will start allowing customers to hold onto their unused data each month.

Those moves helped T-Mobile bring in 6.2 million new connections in the first nine months of the year, giving it a total of 52.9 million subscribers and putting it within spitting distance of overtaking Sprint. But most significantly, T-Mobile is changing the mobile industry as a whole. Two years ago, there was no difference between a two-year contract and a postpaid smartphone plan. But thanks to T-Mobile, all four major carriers are retreating from contracts and subsidies and charging customers lower rates because of it.
— Kevin Fitchard, Senior Writer

Apple’s wild year

Apple CEO Tim Cook announces the Apple Watch during an Apple special event.

[company]Apple[/company] is still the world’s most valuable company by market cap, and although it doesn’t sell the most smartphones, it makes the most money. But while Apple is a formidable force, cracks are starting to show. iPad sales are actually decreasing, Apple’s cloud services are still a mess — as evidenced by the embarrassing iCloud hacks — and Android devices just keep getting cheaper and better. But the biggest Apple story this year is actually next year’s story: In September, Apple revealed its vision of wearable computing in the form of the Apple Watch. We still don’t know exactly what it does or how it does it, but one thing’s for sure: it’s going to be a big story in 2015.
— Kif Leswing, Staff Writer

Wearable devices may go mainstream

After hearing for some time how big the wearable device market will eventually be, 2014 gave us reasons to finally believe the future forecasts: The first signs of serious mainstream adoption emerged this year.

Google launched its Android Wear software platform in June and now there are a half-dozen watches that run on it, with more to come. A peek at the Google Play Store shows that the required Android Wear app for these watches has between 500,000 and a million downloads. Pebble sold 450,000 watches by mid-year and continued to improve its product with Android Wear notifications.

Health tracking devices fitting most budgets have also appeared en masse in retail stores. You can spend $450 on the traditional-timepiece-styled Withings Activité, pay $12 for a Pivotal Tracker, or choose from other options at prices in between. Even Apple is getting in on the new market, announcing its Apple Watch in September with a starting price of $350 when it arrives in early 2015.
— Kevin Tofel, Senior Writer

Nadella puts his stamp on Microsoft

The slow-motion departure of Steve Ballmer as Microsoft CEO was pre-announced in August 2013, but the angst-filled search for his successor lasted into 2014. Yup, it took Microsoft’s board six months to find the next CEO right on its own Redmond, Washington campus. Satya Nadella was named the company’s third-ever CEO in February 2014.

Nadella didn’t take long to make his presence felt. In March, he hosted the public debut of Microsoft’s Office for iPad, stressing the company’s plan to support its applications on all devices, even those that don’t run Windows, in effect decoupling Office from Windows.

Nadella also pushed the Azure cloud hard. In a nod to changing realities, the company even deleted “Windows” from its cloud branding so Windows Azure became Microsoft Azure. If anyone doubted that Microsoft Azure would compete head-on with Amazon Web Services in public cloud, they should be sure of it after Nadella’s first year at the helm.
— Barb Darrow, Senior Writer

The deep learning explosion

Google acquired the Jetpac team in August.

The groundwork was laid in 2013, but 2014 was the year of deep learning. There were acquisitions (including DeepMind, MadBits and Jetpac), startups (Skymind, Ersatz Labs, Enlitic, Butterfly Network and MetaMind), and plenty of debate over whether deep learning is the next big thing or just a lot of buzz. Both are probably somewhat true, but what’s undeniable is the pace of change in the field now that some of the world’s largest companies are funding it — last year’s advances were quickly overshadowed, and breakthroughs came from all over the place, sometimes simultaneously.
— Derrick Harris, Senior Writer

The year of the container

Ben Golub, CEO of Docker

If you’re an avid follower of cloud computing, you’ve probably heard about the container and how it can help developers craft applications more easily as well as simplify IT operations. San Francisco–based Docker has been at the forefront of popularizing the container, which is a type of virtualization technology that lets a single Linux operating system kernel run multiple applications without them impacting one another. What made Docker so popular (it landed $40 million in the fall and is supposedly valued at $400 million) is that it makes it simpler to move these virtual shells — each containing parts of an application — across multiple environments like different clouds or even bare-metal servers. While Docker got a lot of attention this year from big tech companies — Google, VMware and IBM are all supporters — one of its partners, CoreOS, recently decided to launch its own take on container technology, dubbed Rocket. Now Docker might have some major competition as CoreOS’s spin on container tech is generating buzz among cloud watchers.
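The container model Docker popularized can be sketched with a minimal, hypothetical Dockerfile; the file names, base image and port below are invented for illustration, not taken from any real project:

```dockerfile
# Base image supplies the OS userland and the language runtime.
FROM python:2.7
# Bake the application code and its dependencies into the image,
# so the whole bundle ships as one unit.
COPY app.py /srv/app.py
RUN pip install flask
# Declare the service port and the command the container runs.
EXPOSE 8000
CMD ["python", "/srv/app.py"]
```

Once built, the resulting image carries everything the app needs, which is what lets it move unchanged between laptops, clouds and bare-metal servers.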
— Jonathan Vanian, Staff Writer

A seismic shift

In the history of TV, there will be a special chapter for that fateful week in October of 2014 when the unthinkable happened: First, HBO declared that it was going to launch an online service for consumers without cable. The very next day, [company]CBS[/company] actually went ahead and launched its own online subscription service. And hours later, Univision revealed that it wants to launch such an online service as well. That week may well be the beginning of the great unbundling, or at least the week during which TV execs admitted that they can’t keep doing business as usual in the face of a seismic shift in viewing patterns: Just weeks before HBO, Univision and CBS revealed their plans, [company]Netflix[/company] disclosed that its average subscriber already streams 90 minutes every single day.
— Janko Roettgers, Senior Writer

Uber comes of age

In this photo illustration the new smart phone taxi app 'Uber' shows how to select a pick up location next to a taxi lane on October 14 in Madrid. Spain then banned Uber in December.

In 2014, Uber went big time. The ridesharing company raised an additional $2.4 billion in venture funding, at valuations that shattered private-company records. It continued its international expansion and picked up the pace, moving into China and India and expanding its footprint in Europe. It hired Obama’s former campaign manager, David Plouffe, to manage its own public image. Not all was hunky-dory, though. Uber’s elaborate scheme to poach drivers from Lyft became public, along with its threats to dig into journalists’ personal lives. The legal battles have only increased, and Uber fielded lawsuits from cities ranging from Los Angeles to Portland. For better or worse, the company is shaping up to be the Google of this generation.
— Carmel DeAmicis, Staff Writer

First Look Media falters

Investigative reporter Glenn Greenwald speaks at a press conference after accepting the George Polk Award for National Security Reporting, alongside Laura Poitras, Ewen MacAskill and Barton Gellman, on April 11 in New York City.

This year has seen a number of fascinating media events, from BuzzFeed’s $50 million financing and Vice Media’s billion-dollar series of deals with old-media players to the recent bombshell news from Gawker founder Nick Denton that he has turned over control of his blog empire to a managing committee. But I think one of the biggest stories of the year has been the launch and subsequent stumbles of First Look Media, the new venture funded by [company]eBay[/company] billionaire Pierre Omidyar.

Although First Look’s launch was announced in late 2013 (after the news leaked to BuzzFeed), the site didn’t even have a name at the time, and it didn’t actually launch until well into 2014, with the introduction of a “magazine” called The Intercept, run by investigative blogger Glenn Greenwald.

At the time, Omidyar said The Intercept would be the first of a series of similar magazines run by different journalists, including one driven by former Rolling Stone political writer Matt Taibbi, called The Racket. Those plans soon hit a speed bump, however, as stories emerged of micromanagement by Omidyar’s executives, and Taibbi eventually left — followed by Intercept editor John Cook, who returned to Gawker after co-writing a piece about the issues at First Look.

The upheaval has led some to wonder whether the company will ever achieve the goals Omidyar outlined when he announced that he was committing $250 million to it, and whether newer ventures such as former NPR editor Andy Carvin’s social-journalism project, called Reportedly, will be able to rely on the organization for continued support. But ventures like Carvin’s are at least a sign that there is still life in Omidyar’s vision, even if it has been a bumpy ride so far.
— Mathew Ingram, Senior Writer

Ebook subscriptions take off

“Netflix for ebooks” actually started to look like a viable concept in 2014: [company]Amazon[/company] unveiled its ebook subscription service Kindle Unlimited in July. Meanwhile, startups Scribd and Oyster, which had both launched in 2013, expanded their collections in 2014, nabbing a couple of big-five publishers (HarperCollins and Simon & Schuster) that Amazon hasn’t been able to get. And Macmillan CEO John Sargent praised ebook subscription services as a potential way for publishers to challenge Amazon’s dominance in the ebook market. That means that when you use these services, you’ll actually be able to find some big books you want to read (or, in Scribd’s case, listen to — the service added audiobooks in November). It’s unlikely that all three of these services will survive, but for now they are competing against each other with various holiday deals, so it’s a good time to give one of them a try.
— Laura Owen, News Editor

Google acquires Nest and Dropcam

Nest Thermostat

I wanted to call Apple’s debut of HomeKit the most important news for the internet of things this year, but since its debut in June we haven’t seen products launch, and won’t until CES in 2015. Instead, I think the biggest news item was Google’s announced acquisition of Nest for $3.2 billion in January and subsequent acquisition of Dropcam for $550 million in June. The Nest deal put a spotlight on the market and convinced entrepreneurs, venture capitalists and big-name consumer brands that there was a there there in the smart home. From that point on, what had been a mishmash of standards and smaller products became the equivalent of big data — something that, suddenly, everyone needed a strategy for. This will mostly affect the consumer market — businesses are playing an entirely different game when it comes to the internet of things — but it had a huge impact.
— Stacey Higginbotham, Senior Writer

Tesla’s gigafactory takes shape

A recently raised spot of land in the Tahoe-Reno Industrial Center.

Tesla first started talking about the idea of building a massive battery factory at the end of 2013, but it wasn’t until 2014 that the company started to take the steps needed to make that crazy idea a reality. Is it really that crazy? Yes: Tesla’s “gigafactory,” which will produce batteries for its third car as well as for the power grid, could more than double the entire world’s lithium ion battery production.

At the start of the year Tesla raised $2 billion to help fund the factory. Later in the year, it secured Panasonic as a crucial partner. During the summer, Tesla CEO Elon Musk started playing up the search for a site — squeezing cities and states for as many incentives as he could get — and in the fall finally settled on Nevada, at a site just outside of Reno, which we scoped out. The factory deal could help transform the gambling backwater that is Reno into a high-tech manufacturing hub.
— Katie Fehrenbacher, Features Editor and Senior Writer

Virtual reality finally gets some love

An attendee wears an Oculus VR Inc. Rift Development Kit 2 headset to play a video game during the E3 Electronic Entertainment Expo in Los Angeles on June 11.

Am I the only one who finds it crazy that this virtual reality revival only began two years ago? That’s when Oculus launched its Kickstarter campaign, spurring an explosion of startups.

But it wasn’t until March of this year that the interest in virtual reality turned into a frenzy. That’s when Facebook acquired Oculus for $2 billion, suddenly redefining it as a field to which even the largest companies should pay attention. Samsung has since released its Gear VR, and Google has Cardboard. They won’t be the last to release headset options. Oculus’ Rift headset hasn’t even been released to consumers yet. That should give you a hint that it will continue to be a top headline for years to come.
— Signe Brewster, Staff Writer

Bitcoin goes boom and bust

Mark Karpeles (2nd R), president of MtGox bitcoin exchange speaks during a press conference in Tokyo on February 28, 2014. The troubled MtGox Bitcoin exchange filed for bankruptcy protection in Japan on February 28, with its chief executive saying it had lost nearly half a billion dollars worth of the digital currency in a possible theft.

Bitcoin started the year on a high: adoption was rising, and so was the price, peaking at $1,023 on January 26. Now a bitcoin is worth around $350. You can blame, in part, the long shadow that the February fall and bankruptcy of the MtGox exchange has cast over the community. Leaked documents showed that more than 750,000 bitcoins belonging to users were lost, along with 100,000 belonging to the exchange itself. Around the same time, Newsweek claimed to have outed the creator of the cryptocurrency, an identification that Dorian S. Nakamoto has vehemently denied. After that bumpy start, the price has continued to decline even as the blockchain, the underlying technology behind bitcoin, has started to gain traction in other industries, like the internet of things. Expect to see more talk of the blockchain (and perhaps a little less of bitcoin) in 2015.
— Biz Carson, Editorial Assistant

Deep learning now tackling autism and matching monkeys’ vision

Two studies published this week provide even more evidence that deep learning models are very good at computer vision, and that they might be able to tackle some genuinely difficult scientific problems.

The study on computer vision, out of MIT and published in PLOS Computational Biology, shows that deep learning models can be as good as certain primates when it comes to recognizing images during a brief glance. The researchers even suggest that deep learning could help scientists better understand how primate vision systems work.

Charts showing the relative performance of primates and deep learning models.

The genetic study, performed by a team of researchers from the Canadian Institute for Advanced Research and published in Science (available for a fee, but the University of Toronto has a relatively detailed article about the research), used deep learning to analyze the “code” involved in gene splicing. Focusing on mutated gene sequences in subjects with autism, the team was able to identify 39 additional genes that might be tied to autism spectrum disorder.

By now, the capabilities of deep learning in object recognition have been well established, and there is plenty of excitement among entrepreneurs and scientists about how it could apply in medicine. But these findings suggest the excitement has substance, and that the techniques can make a meaningful impact in areas that have little or nothing to do with the web, where many of the recent advances have emerged.

AI startup Expect Labs raises $13M as voice search API takes off

There’s more to speech recognition apps than Siri, Cortana or Google voice search, and a San Francisco startup called Expect Labs aims to prove it. On Thursday, the company announced it has raised a $13 million Series A round of venture capital led by IDG Ventures and USAA, with participation from strategic investors including Samsung, Intel and Telefonica. The company has now raised $15.5 million since launching in late 2012.

Expect Labs started out by building an application called MindMeld that lets users carry on voice conversations and automatically surfaces related content from around the web as they speak. However, that was just a proving ground for what is now the company’s primary business — its MindMeld API. The company released the API in February 2014, and has since rolled out specific modules for media and ecommerce recommendations.

Here’s how the API works, as I described at its launch:

[blockquote person=”” attribution=””]The key to the MindMeld API is its ability (well, the ability of the system behind it) to account for context. The API will index and make a knowledge graph from a website, database or content collection, but then it also collects contextual clues from an application’s users about where they are, what they’re doing or what they’re typing, for example. It’s that context that lets the API decide which search results to display or content to recommend, and when.[/blockquote]
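Expect Labs hasn’t published MindMeld’s internals, but the core idea in that description (blending query relevance with contextual clues when ranking results) can be sketched in a few lines of toy Python. Everything here, including the function names, the scoring scheme and the sample documents, is invented for illustration and is not Expect Labs’ actual API or algorithm.

```python
def rank_results(documents, query_terms, context_terms, context_weight=0.5):
    """Score documents by query match, boosted by contextual clues.

    A toy stand-in for context-aware ranking; not MindMeld's real logic.
    """
    scored = []
    for doc in documents:
        words = set(doc["text"].lower().split())
        # Direct relevance to what the user asked for...
        query_score = len(words & set(query_terms))
        # ...plus a weaker boost for matching the surrounding context.
        context_score = len(words & set(context_terms))
        scored.append((query_score + context_weight * context_score, doc["title"]))
    # Highest combined score first; drop documents with no match at all.
    return [title for score, title in sorted(scored, reverse=True) if score > 0]

docs = [
    {"title": "Kickboxer review", "text": "a kickboxer film review"},
    {"title": "Baking basics", "text": "a film of flour on the counter"},
]
# The query alone is ambiguous ("film" matches both documents);
# the terms the user has been discussing break the tie.
print(rank_results(docs, ["film"], ["kickboxer", "review"]))
# prints ['Kickboxer review', 'Baking basics']
```

The point of the sketch is the second signal: without the context terms, both documents score identically on the query alone.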

Tim Tuttle (left) at Structure Data 2014.

API users don’t actually have to incorporate speech recognition into their apps, and initially many didn’t, but that’s starting to change, said Expect Labs co-founder and CEO Tim Tuttle. There are about a thousand developers building on the API right now, and the vast improvements in speech recognition over the past several months alone have helped pique their interest in voice.

Around the second quarter of next year, he said, “You’re going to see some very cool, very accurate voice apps start to appear.”

He doesn’t think every application is ideal for a voice interface, but he does think it’s ideal for those situations where people need to sort through a large number of choices. “If you get voice right … it can actually be much, much faster to help users find what they need,” he explained, because it’s easier and faster to refine searches when you don’t have to think about what to type and actually type it.

A demo of MindMeld voice search, in which I learned Loren Avedon plays a kickboxer in more than one movie.

Of course, that type of experience requires more than just speech recognition; it also requires the natural language processing and indexing capabilities that are Expect Labs’ bread and butter. Tuttle cited some big breakthroughs in those areas over the past couple of years as well, and said one of his company’s big challenges is keeping up with those advances as they scale from single words up to whole paragraphs of text. It needs to understand the state of the art, and also be able to home in on the sweet spot for voice interfaces, which probably lies somewhere between the two.

“People are still trying to figure out what the logical unit of the human brain is and replicate that,” he said.

Check out Tuttle’s session at Structure Data 2014 below. Structure Data 2015 takes place March 18-19 in New York, and covers all things data, from Hadoop to quantum computing, and from BuzzFeed to crime prediction.


Big-data-based food startup Hampton Creek raises $90M

Hampton Creek, a San Francisco startup that uses advanced data analysis to develop eco-conscious, egg-free food products, has raised a $90 million Series C round of venture capital from a collection of big-name investors. Horizons Ventures and Khosla Ventures led the round, with participation from Salesforce founder and CEO Marc Benioff, Facebook co-founder Eduardo Saverin, and DeepMind founders Mustafa Suleyman and Demis Hassabis, among others.

Hampton Creek has now raised $120 million since launching in 2011. Its products Just Mayo and Just Cookies are sold in grocery stores ranging from Walmart to Whole Foods. It previously sold a general-purpose egg substitute called Beyond Eggs.

According to co-founder and CEO Joshua Tetrick, Hampton Creek is based around a simple premise. “Ninety-nine point nine-nine percent of the food we eat is totally shitty for our bodies and for the environment,” he said, and the only way to change this is to make healthy, sustainable food that people actually want to eat.

Josh Tetrick, the CEO of Hampton Creek Foods. Image courtesy of Gigaom/Katie Fehrenbacher.

However, he added, being labeled as health food or natural food could actually be the kiss of death for what Hampton Creek is trying to accomplish. It needs to be on the shelf next to everything else if it wants consumers to really give it a chance. The hope is they’ll keep buying it because they like it, not just because of what it stands for.

“People are buying [unhealthy food] because it tastes good and it’s affordable,” Tetrick explained. “Those are the magnetic drivers.”

It’s a lofty goal that has required the company to invest heavily in its technology platform and machine learning team, because it will need to discover a lot of non-obvious insights about what ingredients might work in what products. Essentially, it’s looking for plants that Tetrick calls “functionally powerful” — they can replace traditional ingredients with fewer chemicals, less fat, less sugar, a smaller carbon footprint and so on, while still tasting good and retaining, for example, a high amount of protein.
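Hampton Creek hasn’t published its models, but the shape of the screening problem (score thousands of candidate plants on several functional properties at once, then rank them) can be sketched with toy data. The candidates, property values and weights below are invented for illustration; they are not Hampton Creek’s data, and a simple weighted sum stands in for whatever models the company actually uses.

```python
# Toy sketch of multi-property ingredient screening.
# All numbers and names below are invented for illustration.
CANDIDATES = {
    "yellow pea": {"binding": 0.8, "protein": 0.7, "cost": 0.9},
    "sorghum":    {"binding": 0.4, "protein": 0.5, "cost": 0.8},
    "canola":     {"binding": 0.6, "protein": 0.6, "cost": 0.7},
}

# Relative importance of each property for, say, an egg substitute.
WEIGHTS = {"binding": 0.5, "protein": 0.3, "cost": 0.2}

def functional_score(props, weights=WEIGHTS):
    """Weighted sum across properties: a stand-in for a learned model."""
    return sum(weights[k] * props[k] for k in weights)

# Rank every candidate by how "functionally powerful" it looks.
ranked = sorted(CANDIDATES,
                key=lambda name: functional_score(CANDIDATES[name]),
                reverse=True)
print(ranked)  # prints ['yellow pea', 'canola', 'sorghum']
```

At real scale the scoring function would be learned from lab assays rather than hand-weighted, but the pipeline (measure properties, score, rank, send the top candidates to the kitchen) has the same shape.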

The person in charge of the company’s data science efforts is Dan Zigmond, the former chief data scientist for YouTube and [company]Google[/company] Maps. He’ll be speaking in more detail about Hampton Creek’s data-analysis techniques and vision, which entails everything from R models on single machines to deep learning models on whole clusters, at our Structure Data conference March 18–19 in New York.

So far, Hampton Creek has screened about 6,000 plants and hopes to screen hundreds of thousands, Tetrick said. When it finds good candidates — something that could act as a better binding agent or emulsifier than eggs, for example — the company starts thinking about what types of products might benefit from it. It’s already working on pastas as well as a scrambled egg substitute, but won’t bring anything to market until it tastes better, lasts longer and is more affordable than the standard options, he added.

“We consider ourselves a technology company that is doing some pioneering things in food,” Tetrick said, but he might consider reversing that characterization.

Yes, Hampton Creek will have to work like mad and perhaps develop some new techniques in order to analyze all the data it plans to, but that’s just a means to an end. Ultimately, beating food and consumer industry giants on taste, cost and distribution — or even getting them to take serious notice — will mean excelling at the business of food.