Remember when machine learning was hard? That’s about to change

A few years ago, there was a shift in the world of machine learning.

Companies such as Skytree and Context Relevant began popping up, promising to make it easier for businesses outside the big banks and web giants to run machine learning algorithms, and to do it at a scale congruent with the big data promise they were being pitched. Soon, there were many startups promising bigger, faster, easier machine learning. Machine learning became the new black as it was baked into untold software packages and services — machine learning for marketing, machine learning for security, machine learning for operations, and on and on and on.

Eventually, deep learning emerged from the shadows and became a newer, shinier version of machine learning. It, too, was very difficult and required serious expertise to do. Until it didn’t. Now, deep learning is the focus of numerous startups, all promising to make it easy for companies and developers of all stripes to deploy.

Joseph Sirosh

But it’s not just startups leading the charge in this democratization of data science — large IT companies are also getting in on the act. In fact, Microsoft now has a corporate vice president of machine learning. His name is Joseph Sirosh, and we spoke with him on this week’s Structure Show podcast. Here are some highlights from that interview, but it’s worth listening to the whole thing for his take on Microsoft’s latest news (including support for R and Python in its Azure ML cloud service) and competition in the cloud computing space.

You can also catch Sirosh — and lots of other machine learning and big data experts and executives — at our Structure Data conference next month in New York. We’ll be highlighting the newest techniques in taking advantage of data, and talking to the people building businesses around them and applying them to solve real-world problems.


Download This Episode

Subscribe in iTunes

The Structure Show RSS Feed

Why the rise of machine learning and why now

“I think the cloud has transformed [machine learning], the big data revolution has transformed it,” Sirosh said. “But at the end of the day, I think the opportunity that is available now because of the vast amount of data that is being collected from everywhere . . . is what is making machine learning even more attractive. . . . As most of behavior, in many ways, comes online on the internet, the opportunity to use the data generated on interactions on websites and software to tailor customer experiences, to provide better experiences for customers, to also generate new revenue opportunities and save money — all of those become viable and attractive.”

Asked whether all of this would be possible without the cloud, Sirosh said it would be, but — like most things — it would be a lot more difficult.

“The cloud makes it easy to integrate data, it makes it easy to, in place, do machine learning on top of it, and then you can publish applications on the same cloud,” he said. “And all of this process happens in one place and much faster, and that changes the game quite a bit.”

Deep learning made easy and easier

Sirosh said he began his career in neural networks and actually earned his Ph.D. studying them, so he’s happy to see deep learning emerge as a legitimately useful technology for mainstream users.

“My take on deep learning is actually this,” he explained. “It is a continuing evolution in that field, we just have now gotten to the level where we have identified great algorithmic tricks that allow you to take performance and accuracy to the next level.”

Deep learning is also an area where Microsoft sees a big opportunity to bring its expertise in building easily consumable applications to bear. Azure ML already makes it relatively easy to train deep neural networks using the same types of methods as its researchers do, Sirosh noted, but users can expect even more in the months to come.

“We will also provide fully trained neural networks,” he said. “We have a tremendous amount of data in images and text data and so on inside of Bing. We will use our massive compute power to learn predictive models from this data and offer some of those pre-trained, canned neural networks in the future in the product so that people will find it very easy to use.”

The results of a Microsoft computer vision algorithm it says can outperform humans at some tasks.

How easy can all of this really be?

As long as there are applications that can hide its complexity, Sirosh has a vision for machine learning that’s much broader than even the world of enterprise IT sales.

“Well, we are actually going after a broad audience with something like machine learning,” he said. “We want to make it as simple as possible, even for students in a high school or in college. In my way of thinking about it, if you’re doing statistics in high school, you should be able to use [a] machine learning tool, run R code and statistical analysis on it. And you can teach machine learning and statistical analysis using this tool if you so choose to.”

Is Microsoft evolving from an operating system company to a data company?

Not entirely, but Sirosh did suggest that Microsoft sees a shift happening in the IT world and is moving fast to ride the wave.

“I think you should even first ask, ‘How big is the world of data to computing itself?'” he said. “I would say that in the future, a huge part of the value being generated in the field of computing . . . is going to come from data, as opposed to storage and operating systems and basic infrastructure. It’s the data that is most valuable. And if that is where in the computing industry most of the value is going to be generated, well that is one place where Microsoft will generate a lot of its value, as well.”

Why deep learning is at least inspired by biology, if not the brain

As deep learning continues gathering steam among researchers, entrepreneurs and the press, there’s a loud-and-getting-louder debate about whether its algorithms actually operate like the human brain does.

The comparison might not matter much to developers who just want to build applications that can identify objects or predict the next word you’ll text, but it does matter. Researchers leery of another “AI winter,” or trying to refute worries of a forthcoming artificial superintelligence, worry that the brain analogy is setting people up for disappointment, if not undue stress. When people hear “brain,” they think about machines that can think like us.

On this week’s Structure Show podcast, we dove into the issue with Ahna Girshick, an accomplished neuroscientist, visual artist and senior data scientist at deep learning startup Enlitic. Girshick’s colleague, Enlitic founder and CEO (and former Kaggle chief scientist) Jeremy Howard, also joined us for what turned out to be a rather insightful discussion.


Below are some of the highlights, focused on how Girshick and Howard view the brain analogy. (They take a different tack than Google researcher Greg Corrado, who recently called the analogy “officially overhyped.”) But we also talk at length about deep learning in general, and how Enlitic is using it to analyze medical images and, hopefully, help overcome a global shortage of doctors.

If you’re interested in hearing more from Girshick, and learning more about Enlitic and deep learning, come to our Structure Data conference next month, where she’ll be accepting a startup award and joining me on stage for an in-depth talk about how artificial intelligence can improve the health care system. If you want two full days of all AI, all the time, start making plans for our Structure Intelligence conference in September.

Ahna Girshick, Enlitic’s senior data scientist.

Natural patterns at work in deep learning systems

“It’s true, deep learning was inspired by how the human brain works,” Girshick said on the Structure Show, “but it’s definitely very different.”

Just like with our vision systems, deep learning systems for computer vision process stuff in layers, if you will. They start with edges and then get more abstract with each layer, focusing on faces or perhaps whole objects, she explained. “That said, our brain has many different types of neurons,” she added. “Everywhere we look in the brain we see diversity. In these artificial networks, every node is trying to basically do the same thing.”
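Girshick’s point about layered processing can be made concrete with a toy example. The sketch below is a minimal, illustrative numpy pipeline, not code from Enlitic or any real framework: a single convolutional layer with a hand-picked edge-detecting kernel, a ReLU, and max pooling — the basic building blocks that let each successive layer of a deep network respond to more abstract structure.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation) of a grayscale image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0)

def max_pool(x, size=2):
    """Downsample by taking the max over non-overlapping size x size blocks."""
    h, w = x.shape
    h, w = h - h % size, w - w % size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

# A hand-picked kernel that responds to a dark-to-bright horizontal edge,
# like the edge detectors in the earliest layers Girshick describes.
edge_kernel = np.array([[-1.0, -1.0, -1.0],
                        [0.0, 0.0, 0.0],
                        [1.0, 1.0, 1.0]])

image = np.zeros((8, 8))
image[4:, :] = 1.0          # bottom half bright: one horizontal edge

layer1 = max_pool(relu(conv2d(image, edge_kernel)))  # detect edges, then pool
```

Stacking several such conv-pool stages, with kernels learned from data rather than hand-picked, is what gives deep networks their hierarchy of features, from edges up to faces and whole objects.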

This is why our brains are able to navigate a dynamic world and do many things, while deep learning systems are usually focused on one task with a clear objective. Still, Girshick said, “From a computer vision standpoint, you can learn so much by looking at the brain that why not.”

She explained some of these connections by discussing a research project she worked on at NYU:

“We were interested in, kind of, the statistics of the world around us, the visual world around us. And what that means is basically the patterns in the visual world around us. If you were to take a bunch of photos of the world and run some statistics on them, you’ll find some patterns — things like more horizontals than verticals. . . . And then we look inside the brain and we see, ‘Gee, wow, there’s all these neurons that are sensitive to edges and there’s more of them that are sensitive to horizontals than verticals!’ And then we measured . . . the behavioral response in a type of psychology experiment and we see, ‘Gee, people are biased to perceive things as more horizontal or more vertical than they actually are!’”
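The horizontal-versus-vertical statistics Girshick describes are straightforward to measure. The helper below is a hypothetical numpy sketch that compares row-direction and column-direction intensity changes in a toy “landscape” image; real studies use oriented filter banks over large photo collections, so this only illustrates the idea.

```python
import numpy as np

def orientation_energy(image):
    """Compare horizontal vs. vertical edge energy in a grayscale image.

    Returns (horizontal, vertical): the summed magnitude of between-row and
    between-column intensity changes. Natural scenes tend to contain more
    horizontal structure, the bias Girshick's NYU work measured.
    """
    dy = np.abs(np.diff(image, axis=0))  # change between rows -> horizontal edges
    dx = np.abs(np.diff(image, axis=1))  # change between columns -> vertical edges
    return dy.sum(), dx.sum()

# A toy "landscape": bright sky over dark ground yields one strong
# horizontal edge and no vertical structure at all.
scene = np.vstack([np.ones((4, 8)), np.zeros((4, 8))])
horiz, vert = orientation_energy(scene)
```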

Asked if computer vision has been such a big focus of deep learning research so far because of those biological parallels, or because that’s what companies such as Google and Facebook have the most need for, Girshick suggested it’s a bit of both. “It’s the same in the neuroscience department at a university,” she said. “The reason that people focus on vision is because a third of our cortex is devoted to vision — it’s a major chunk of our brain. . . . It’s also something that’s easier for us to think about, because we see it.”

Jeremy Howard (left) at Structure: Data 2012.

Howard noted that the team at Enlitic keeps finding more connections between Girshick’s research and the cutting edge of deep learning, and suggested that attempts to distance the two fields are sometimes insincere. “I think it’s kind of fashionable for people to say how deep learning is just math and these people who are saying ‘brain-like’ are crazy, but the truth is … it absolutely is inspired by the brain,” he said. “It’s a massive simplification, but we keep on finding more and more inspirations.”

The issue probably won’t be resolved any time soon — in part because it’s so easy for journalists and others to take the easy way out when explaining deep learning — but Girshick offered a solution.

“Maybe they should say ‘inspired by biology’ instead of ‘inspired by the brain,'” she said. “. . . Yes, deep learning is kind of amazing and very flexible compared to other generations of algorithms, but it’s not like the intelligent system I was studying when I studied the brain — at all.”

Microsoft says its new computer vision system can outperform humans

Microsoft researchers claim in a recently published paper that they have developed the first computer system capable of outperforming humans on a popular benchmark. While it’s estimated that humans can classify images in the ImageNet dataset with an error rate of 5.1 percent, Microsoft’s team said its deep-learning-based system achieved an error rate of only 4.94 percent.

Their paper was published less than a month after Baidu published a paper touting its record-setting system, which it claimed achieved an error rate of 5.98 percent using a homemade supercomputing architecture. The best performance in the actual ImageNet competition so far belongs to a team of Google researchers, who in 2014 built a deep learning system with a 6.66 percent error rate.

A set of images that the Microsoft system classified correctly. “GT” means ground truth; below are the top five predictions of the deep learning system.

“To our knowledge, our result is the first published instance of surpassing humans on this visual recognition challenge,” the paper states. “On the negative side, our algorithm still makes mistakes in cases that are not difficult for humans, especially for those requiring context understanding or high-level knowledge…

“While our algorithm produces a superior result on this particular dataset, this does not indicate that machine vision outperforms human vision on object recognition in general . . . Nevertheless, we believe our results show the tremendous potential of machine algorithms to match human-level performance for many visual recognition tasks.”

A set of images where the deep learning system didn’t match the given label, although it did correctly classify objects in the scene.

One of the Microsoft researchers, Jian Sun, explains the difference in plainer English in a Microsoft blog post: “Humans have no trouble distinguishing between a sheep and a cow. But computers are not perfect with these simple tasks. However, when it comes to distinguishing between different breeds of sheep, this is where computers outperform humans. The computer can be trained to look at the detail, texture, shape and context of the image and see distinctions that can’t be observed by humans.”

If you’re interested in learning how deep learning works, why it’s such a hot area right now and how it’s being applied commercially, think about attending our Structure Data conference, which takes place March 18 and 19 in New York. Speakers include deep learning and machine learning experts from Facebook, Yahoo, Microsoft, Spotify, Hampton Creek, Stanford and NASA, as well as startups Blue River Technology, Enlitic, MetaMind and TeraDeep.

We’ll dive even deeper into artificial intelligence at our Structure Intelligence conference (Sept. 22 and 23 in San Francisco), where early confirmed speakers come from Baidu, Microsoft, Numenta and NASA.

No, you don’t need a ton of data to do deep learning


There are a couple of seemingly contradictory memes rolling around the deep learning field. One is that you need a truly epic amount of data to do interesting work. The other is that in many subject areas there is a ton of data, but it’s not just lying around for data scientists to snarf up.

On this week’s Structure Show podcast, Enlitic Founder and CEO Jeremy Howard and Senior Data Scientist Ahna Girshick address those topics and more.

Girshick, our first guest to have worked with Philip Glass and Björk on music visualizations, said there are scads of MRIs, CAT scans and X-rays created, but once they’re used for their primary purpose — to diagnose your bum knee — they are squirreled away in some PACS system, never to see the light of day again.

All of that data is useful for machine learning algorithms, or would be, if it were accessible, she said.

Ahna Girshick, Enlitic’s senior data scientist.

Girshick and Howard agreed that while deep learning — the process of a computer teaching itself how to solve a problem — gets better as the data set grows, there’s no reason to hold off on working with it until more data becomes available.

“While more data can be better, I think this is stopping people from trying to use big data,” Howard said. He cited a recent Kaggle competition on facial keypoint recognition that uses 7,000 images, in which “the top algorithms are nearly perfectly accurate.”

The reason companies like Baidu and Google say you need mountains of data is because they have mountains of data available, he said. “I don’t think people should be put off trying to use deep learning just because they don’t have a lot of data.”
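One reason a few thousand images can go a long way is label-preserving data augmentation, which stretches a small training set by transforming each example. Below is a minimal numpy sketch with hypothetical names, using simple class labels and horizontal flips for clarity:

```python
import numpy as np

def augment(images, labels):
    """Double a small dataset by adding horizontally flipped copies.

    Simple label-preserving transforms (flips, crops, small rotations) are a
    standard way to stretch a few thousand training images further, one
    reason small datasets can still support accurate models.
    """
    flipped = images[:, :, ::-1]          # mirror each image left-right
    return (np.concatenate([images, flipped]),
            np.concatenate([labels, labels]))

rng = np.random.default_rng(0)
images = rng.random((10, 32, 32))   # pretend batch of 10 grayscale images
labels = np.arange(10) % 2          # pretend binary class labels

aug_images, aug_labels = augment(images, labels)
```

For a task like facial keypoint detection, a horizontal flip would also require mirroring the keypoint coordinates themselves; the principle is the same.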

Enlitic is using deep learning to deliver medical diagnoses faster and to provide better medical outcomes for millions of underserved people.

It’s a fascinating discussion, so please check it out — Girshick will speak more about what Enlitic is doing at Structure Data next month.

And, if you want to hear what’s going on with Pivotal’s big data portfolio, Derrick Harris has the latest. Oh, and Microsoft makes a bold play for startups by ponying up $500K in Azure cloud credits, starting with the Y Combinator Winter 2015 class. That ups the ante pretty significantly compared to what [company]Amazon[/company] Web Services, [company]Google[/company] and [company]IBM[/company] offer. Your move, boys.



Hosts: Barb Darrow and Derrick Harris.



Gigaom’s new AI and deep learning conference: Structure Intelligence

We are thrilled to announce the launch of Gigaom’s newest conference, Structure Intelligence. In the past year we’ve seen massive growth in Artificial Intelligence (AI) and deep learning. Our own Derrick Harris has been covering this area for years and we have decided it’s time to give this rapidly growing area a platform (and conference) of its own.

Structure Intelligence will take place September 22–23 in San Francisco at the Julia Morgan Ballroom. Early speaker confirmations include executives from Baidu, Microsoft, NASA and many more.

If you’re interested in speaking, sponsoring, attending, or partnering with us, please click on the appropriate link below and let us know.

We look forward to seeing you in September. Snag your ticket early and get a great rate.

PhotoTime is a deep learning application for the rest of us

A Sunnyvale, California, startup called Orbeus has developed what could be the best application yet for letting everyday consumers benefit from advances in deep learning. It’s called PhotoTime and, yes, it’s yet another photo-tagging app. But it looks really promising and, more importantly, it isn’t focused on business uses like so many other recent deep-learning-based services, nor has it been acquired and dissolved into Dropbox or Twitter or Pinterest or Yahoo.

Deep learning, for anyone unfamiliar with the term, essentially describes a class of artificial intelligence algorithms that excel at learning the latent features of the data they analyze. The more data that deep learning systems have to train on, the better they perform. The field has made big strides in recent years, largely with regard to machine-perception workloads such as computer vision, speech recognition and language understanding.

(If you want to get a crash course in what deep learning is and why web companies are investing billions of dollars into it, come to Structure Data in March and watch my interview with Rob Fergus of Facebook Artificial Intelligence Research, as well as several other sessions.)

The Orbeus team. L to R: Yuxin Wu, Yi Li, Wei Xia and Meng Wang.

I am admittedly late to the game in writing about PhotoTime (it was released in November) because, well, I don’t often write about mobile apps. The people who follow this space for a living, though, also seemed impressed with it when they reviewed it back then. Orbeus, the company behind PhotoTime, launched in 2012 and its first product is a computer vision API called ReKognition. According to CEO Yi Li, it has already raised nearly $5 million in venture capital.

But I ran into the Orbeus team at a recent deep learning conference and was impressed with what they were demonstrating. As an app for tagging and searching photos, it appears very rich. It tags smartphone photos using dozens of different categories, including place, date, object and scene. It also recognizes faces — either by connecting to your social networks and matching contacts with people in the photos, or by building collections of photos including the same face and letting users label them manually.

You might search your smartphone, for example, for pictures of flowers you snapped in San Diego, or for pictures of John Smith at a wedding in Las Vegas in October 2013. I can’t vouch for its accuracy personally because the PhotoTime app for Android isn’t yet available, but I’ll give it the benefit of the doubt.


More impressive than the tagging features, though — and the thing that could really set it apart from other deep-learning-powered photo-tagging applications, including well-heeled ones such as Google+, Facebook and Flickr — is that PhotoTime actually indexes the album locally on users’ phones. Images are sent to the cloud, run through Orbeus’s deep learning models, and then the metadata is sent back to your phone so you can search existing photos even without a network connection.
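That split — heavy inference in the cloud, but the resulting tags indexed on the device — can be sketched as a simple inverted index. Everything below is hypothetical (the class and method names are invented for illustration, not Orbeus’s actual API); it just shows why search can keep working offline once the metadata is local.

```python
from collections import defaultdict

class LocalPhotoIndex:
    """Toy on-device tag index: tags come back from a cloud model once,
    then every search is a local dictionary lookup with no network needed."""

    def __init__(self):
        self._by_tag = defaultdict(set)   # tag -> set of photo ids

    def add(self, photo_id, tags):
        """Store the tags returned by the cloud model for one photo."""
        for tag in tags:
            self._by_tag[tag.lower()].add(photo_id)

    def search(self, *tags):
        """Return photos matching ALL given tags (case-insensitive)."""
        sets = [self._by_tag.get(t.lower(), set()) for t in tags]
        return set.intersection(*sets) if sets else set()

index = LocalPhotoIndex()
index.add("img_001.jpg", ["flower", "San Diego"])
index.add("img_002.jpg", ["wedding", "Las Vegas", "John Smith"])
index.add("img_003.jpg", ["flower", "garden"])
```

A query like “flowers in San Diego” then reduces to intersecting two tag sets, which is why it stays fast and offline-friendly.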

The company does have a fair amount of experience in the deep learning field, with several members, including research scientist Wei Xia, winning a couple of categories at last year’s ImageNet object-recognition competition as part of a team from the National University of Singapore. Xia told me that while PhotoTime’s application servers run largely on Amazon Web Services, the company’s deep learning system resides on a homemade, liquid-cooled GPU cluster in the company’s headquarters.

Here’s what that looks like.

The Orbeus GPU cluster.

As I’ve written before, though, tagging photos is only part of the ideal photo-app experience, and there’s still work to do there no matter how well the product functions. I’m still waiting for some photo application to perfect the curated photo album, something Disney Research is working on using another machine learning approach.

And while accuracy continues to improve for recognizing objects and faces, researchers are already hard at work applying deep learning to everything from recognizing the positions of our bodies to the sentiment implied by our photos.

TeraDeep wants to bring deep learning to your dumb devices

Open the closet of any gadget geek or computer nerd, and you’re likely to find a lot of skeletons. Stacked deep in a cardboard box or Tupperware tub, there they are: The remains of webcams, routers, phones and other devices deemed too obsolete to keep using and left to rot, metaphorically speaking, until they eventually find their way to a Best Buy recycling bin.

However, an under-the-radar startup called TeraDeep has developed a way to revive at least a few of those old devices by giving them the power of deep learning. The company has built a module that it calls the CAMCUE, which runs on an ARM-based processor and is designed to plug into other gear and run deep neural network algorithms on the inputs they send through. It could turn an old webcam into something with the smart features of a Dropcam, if not smarter.

“You can basically turn our little device into anything you want,” said TeraDeep co-founder and CTO Eugenio Culurciello during a recent interview. That potential is why the company won a Structure Data award as one of the most promising startups to launch in 2014, and will be presenting at our Structure Data conference in March.

Didier Lacroix (left) and Eugenio Culurciello (right)

But before TeraDeep can start transforming the world’s dumb gear into smart gear, the company needs to grow — a lot. It’s headquartered in San Mateo, California, and is the brainchild of Culurciello, who moonlights as an associate professor of engineering at Purdue University in Indiana. It has 10 employees, only three of whom are full-time. It has a prototype of the CAMCUE, but isn’t ready to start mass-producing the modules and getting them into developers’ hands.

I recently saw a prototype of it at a deep learning conference in San Francisco, and was impressed by how well it worked, albeit in a simple use case. Culurciello hooked the CAMCUE up to a webcam and to a laptop, and as he panned the camera, the display on the computer screen would signal the presence of a human whenever I was in the shot.

“As long as you look human-like, it’s going to detect you,” he said.

The prototype system can be set to detect a number of objects, including iPhones, which it was able to do when the phone was held vertically.

The webcam setup on a conference table.

TeraDeep also has developed a web application, software libraries and a cloud platform that Culurciello said should make it fairly easy for power users and application developers, initially, and then perhaps everyday consumers to train TeraDeep-powered devices to do what they want them to do. It could be “as easy as uploading a bunch of images,” he said.

“You don’t need to be a programmer to make these things do magic,” TeraDeep CEO Didier Lacroix added.

But Culurciello and Lacroix have bigger plans for the company’s technology — which is the culmination of several years of work by Culurciello to develop specialized hardware for neural network algorithms — than just turning old webcams into smarter webcams. They’d like the company to become a platform player in the emerging artificial intelligence market, selling embedded hardware and software to fulfill the needs of hobbyists and large-scale device manufacturers alike.

A TeraDeep module, up close.

It already has a few of the pieces in place. Aside from the CAMCUE module, which Lacroix said will soon shrink to about the surface area of a credit card, the company has also tuned its core technology (called nn-x, or neural network accelerator) to run on existing smartphone platforms. This means developers could build mobile apps that do computer vision at high speed and low power without relying on GPUs.

TeraDeep has also worked on system-on-a-chip designs for partners that might want to embed more computing power into their devices. Think drones, cars and refrigerators, or smart-home gadgets a la the Amazon Echo and Jibo that rely heavily on voice recognition.

Lacroix said all the possibilities, and the interest the company has received from folks who’ve seen and heard about the technology, are great, but he noted that they might lead such a small company to suffer from a lack of focus, or perhaps option paralysis.

“It’s overwhelming. We are a small company, and people get very excited,” he said. “… We cannot do everything. That’s a challenge for us.”

Why the real world is the next big opportunity in big data

Matt Ocko had been investing in data-based companies for years before he and Zack Bogue officially launched Data Collective Venture Capital in 2012. Since then, the firm has made seed investments in nearly every hot startup in the space, ranging from database companies such as MemSQL to satellite companies such as Planet Labs. This week, Ocko came on the Structure Show podcast to talk about what has him excited as an investor and what’s just overdone.

He has interesting opinions on a lot of topics in the tech space, all of which are worth hearing, but here are the highlights from our interview. If you want to hear more from Ocko, he’ll be speaking at Structure Data (March 18-19 in New York) alongside last week’s podcast guest, Hilary Mason. It should make for a really fun talk about the future of data.

As it turns out, a handful of Data Collective portfolio companies will also be presenting, including Enlitic, Blue River Technology and Interana.



Useful doesn’t always mean sexy

As sexy as the Snapchats and Slacks of the world are, Ocko is excited to see elite developers — who he says can do wonders with today’s infrastructure technologies — turn their attention toward applications such as supply-chain management or agriculture that can have “a material impact on a pretty big swath of GDP.” These spaces might seem unglamorous, and the companies and technologies might have to sneak up on the world, “a little contrarian and a little unloved,” but the rewards can be huge, Ocko said.

Just look at [company]SAP[/company], he explained:

“[T]hey said, ‘Hey, let’s integrate your accounting, and your supply chain, and your manufacturing and planning, and we’ll tell you what’s coming into your factories, how much of it is being made, how much it costs you and how much you sold it for.’ And that was transformative for manufacturing. That was more transformative, I would argue, than early industrial robots. People’s brains melted out of their ears. This was a massive operational advantage for these companies.”

Or the personal computer, which Ocko said didn’t get a whole lot of love when the concept was first introduced in the 1970s: “And it was way more transformative than any 1,000 mainframes you could have built. It created the industry that we have today.”


Matt Ocko

‘Software eats glassware’

Specifically, Ocko said Data Collective is excited about the potential for new technologies to underpin companies that can fundamentally improve lives, often by focusing on difficult scientific challenges.

“We call it ‘software eats glassware,'” he said. “We see stuff happening in computational biology and related informatics fields where you can model living things without huge capex, or even opex, and get insight into making people and animals and, heck, the planet itself a lot healthier in a way that’s consistent with the capitalist system that we all have to live in, but has profound positive impact while you’re making money.”

Betting on applications, not technologies — even deep learning

Even in the hottest of hot technology areas, Ocko says the focus from an investment standpoint still needs to be whether the technology has real and necessary applications, not just some cool research and maybe a big name. He invested in deep-learning-for-medical-imaging startup Enlitic, for example, but isn’t keen on building a portfolio of deep learning startups just because they’re getting acquired like mad right now.

“Just kind of crossing our fingers and praying for a DeepMind $400 million exit … because people in the company are so brilliant feels kind of dot.commy to me,” Ocko said. “. . . That’s just a recipe for blowing your [limited partners’] money.”

We’re good on Hadoop and marketing software for now

Even with some truly innovative and truly transformative technologies, though, there comes a point when markets become saturated. That doesn’t mean startups in those spaces will crash and burn — they could build nice companies — but they do become less attractive as investment opportunities.

One of them is Hadoop, which Ocko says still could use a lot of finessing, but has probably peaked in terms of producing huge valuations. “If you are lighting up a Hadoop cluster of some sort, there’s still a lot of ‘Hadoop for x’ that’s probably needed to make your life easier,” he said. “But to your point, I’m not sure those are giant companies anymore. They may well be very nice, but not home-run — from a VC perspective — acquisitions for a Hortonworks or MapR or Cloudera.”

Another is sales and marketing software souped up with machine learning. “The number of companies I’ve seen pursuing closely related pipeline-mining opportunities, either for marketing optimization or for sales optimization, is literally over 100 now,” Ocko said. “When there are that many closely related companies with similar ideas, I think you’re potentially headed for tragedy.”

New to deep learning? Here are 4 easy lessons from Google

Google employs some of the world’s smartest researchers in deep learning and artificial intelligence, so it’s not a bad idea to listen to what they have to say about the space. One of those researchers, senior research scientist Greg Corrado, spoke at RE:WORK’s Deep Learning Summit on Thursday in San Francisco and gave some advice on when, why and how to use deep learning.

His talk was pragmatic and potentially very useful for folks who have heard about deep learning and how great it is — well, at computer vision, language understanding and speech recognition, at least — and are now wondering whether they should try using it for something. The TL;DR version is “maybe,” but here’s a little more nuanced advice from Corrado’s talk.

(And, of course, if you want to learn even more about deep learning, you can attend Gigaom’s Structure Data conference in March and our inaugural Structure Intelligence conference in September. You can also watch the presentations from our Future of AI meetup, which was held in late 2014.)

1. It’s not always necessary, even if it would work

Probably the most useful piece of advice Corrado gave is that deep learning isn’t necessarily the best approach to solving a problem, even if it would offer the best results. Presently, it’s computationally expensive (in all meanings of the word), it often requires a lot of data (more on that later) and it probably requires some in-house expertise if you’re building systems yourself.

So while deep learning might ultimately work well on pattern-recognition tasks on structured data — fraud detection, stock-market prediction or analyzing sales pipelines, for example — Corrado said it’s easier to justify in the areas where it’s already widely used. “In machine perception, deep learning is so much better than the second-best approach that it’s hard to argue with,” he explained, while the gap between deep learning and other options is not so great in other applications.

That being said, I found myself in multiple conversations at the event centered around the opportunity to soup up existing enterprise software markets with deep learning and met a few startups trying to do it. In an on-stage interview I did with Baidu’s Andrew Ng (who worked alongside Corrado on the Google Brain project) earlier in the day, he noted how deep learning is currently powering some ad serving at Baidu and suggested that data center operations (something Google is actually exploring) might be a good fit.

Greg Corrado


2. You don’t have to be Google to do it

Even when companies do decide to take on deep learning work, they don’t need to aim for systems as big as those at Google or Facebook or Baidu, Corrado said. “The answer is definitely not,” he reiterated. “. . . You only need an engine big enough for the rocket fuel available.”

The rocket analogy is a reference to something Ng said in our interview, explaining the tight relationship between systems design and data volume in deep learning environments. Corrado explained that Google needs a huge system because it’s working with huge volumes of data and needs to be able to move quickly as its research evolves. But if you know what you want to do or don’t have major time constraints, he said, smaller systems could work just fine.

For getting started, he added later, a desktop computer could actually work provided it has a sufficiently capable GPU.

3. But you probably need a lot of data

However, Corrado cautioned, it’s no joke that training deep learning models really does take a lot of data — ideally, as much as you can get your hands on. If he’s advising executives on when they should consider deep learning, it pretty much comes down to (a) whether they’re trying to solve a machine perception problem and/or (b) whether they have “a mountain of data.”

If they don’t have a mountain of data, he might suggest they get one. At least 100 trainable observations per feature you want to train is a good start, he said, adding that it’s conceivable to waste months of effort trying to optimize a model that could have been improved a lot more quickly if you had just spent some time gathering training data early on.
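As a rough illustration of Corrado’s rule of thumb, here’s a minimal back-of-the-envelope sketch. The function name and the example feature count are ours, not Google’s; the only input from the talk is the heuristic of roughly 100 trainable observations per feature as a starting point.

```python
def min_training_examples(num_features, obs_per_feature=100):
    """Back-of-the-envelope data requirement from Corrado's rule of
    thumb: roughly 100 trainable observations per feature to start."""
    return num_features * obs_per_feature

# A hypothetical model with 1,000 input features would want
# at least 100,000 labeled examples before serious tuning.
print(min_training_examples(1000))  # 100000
```

The point is less the exact multiplier than the order-of-magnitude check: if your dataset is far below this estimate, gathering more data will likely pay off faster than tuning the model.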

Corrado said he views his job not as building intelligent computers (artificial intelligence) or building computers that can learn (machine learning), but as building computers that can learn to be intelligent. And, he said, “You have to have a lot of data in order for that to work.”

Source: Google

Training a system that can do this takes a lot of data.

4. It’s not really based on the brain

Corrado received his Ph.D. in neuroscience and worked on IBM’s SyNAPSE neurosynaptic chip before coming to Google, and says he feels confident in saying that deep learning is only loosely based on how the brain works. And that’s based on what little we know about the brain to begin with.

Earlier in the day, Ng said about the same thing. To drive the point home, he noted that while many researchers believe we learn in an unsupervised manner, most production deep learning models today are still trained in a supervised manner. That is, they analyze lots of labeled images, speech samples or other data in order to learn to recognize them.

And comparisons to the brain, while easier than nuanced explanations, tend to lead to overinflated connotations about what deep learning is or might be capable of. “This analogy,” Corrado said, “is now officially overhyped.”

Update: This post was updated on Feb. 2 to correct a statement about Corrado’s tenure at Google. He was with the company before Andrew Ng and the Google Brain project, and was not recruited by Ng to work on it, as originally reported.

Hilary Mason on taking big data from theory to reality


If you’re interested in assessing how and when a given data technology — deep learning, machine intelligence, natural language generation — can move from the theoretical to commercial use, Hilary Mason may have the best job around. This week’s guest, the CEO and founder of Fast Forward Labs, talks about how that startup taps into a wide array of sources — from academic and commercial research and the open source world to “outsider art” in the realms of spam and malware — to come up with new ideas for applications.

One natural language generation (NLG) project, for example, lets a person who wants to sell her house enter the parameters — square footage, number of rooms, etc. — then step back and let the system write up the ad for that property. (As a person who makes her living from writing words, all I can say is: “ick.”)

She’s also got an interesting take on opportunities in the internet of things — a term she dislikes — and why the much-maligned title of data scientist has validity. Mason is really interesting, so if you’re pressed for time, check out at least the second half of this podcast. And to hear more from her, be sure to sign up for Structure Data in March, where she will return to speak.

Shivon Zilis, VC, Bloomberg Beta; Sven Strohband, Partner and CTO, Khosla Ventures; Hilary Mason, Data Scientist in Residence, Accel Partners; Jalak Jobanputra, Managing Partner, FuturePerfect Ventures.


As for segment one, Derrick and I discuss Datapipe’s acquisition of GoGrid, the first cloud consolidation move of the new year; the long-awaited Box IPO; and an itty bit on Microsoft’s foray into augmented reality.

So get cozy and take a listen.


Hosts: Barb Darrow and Derrick Harris.

Download This Episode

Subscribe in iTunes

The Structure Show RSS Feed

