Baidu’s Andrew Ng talks AI with Gigaom

Dr. Andrew Ng joined Baidu in May 2014 as chief scientist. He is responsible for driving the company’s global AI strategy and infrastructure. He leads Baidu Research in Beijing and Silicon Valley as well as technical teams in the areas of speech, big data and image search. In addition to his role at Baidu, Dr. Ng is an adjunct professor in the computer science department at Stanford University. In 2011 he led the development of Stanford’s Massive Open Online Course (MOOC) platform and taught an online machine learning class that was offered to over 100,000 students. This led to the co-founding of Coursera, where he continues to serve as chairman. Previously, Dr. Ng was the founding lead of the Google Brain deep learning project. Dr. Ng has authored or co-authored over 100 research papers in machine learning, robotics and related fields. In 2013 he was named to the Time 100 list of the most influential persons in the world. He holds degrees from Carnegie Mellon University, MIT and the University of California, Berkeley.
Andrew will be speaking at Gigaom AI in San Francisco, February 15-16th. In anticipation of that, I caught up with him to ask a few questions.

Tell us about your daily work at Baidu. What does your AI team do?
We’re involved in developing basic AI technology. Everything ranging from speech recognition to computer vision, to NLP, to data warehousing, to user understanding; and using this technology to support a lot of Baidu internal businesses, as well as incubate new directions. So for example, within Baidu, all of our major lines of business have already been transformed using AI. Everything from web search to advertising, to machine translation, to the way that we recommend restaurants to users. So, AI is already pervasive throughout Baidu. In addition to that, we see a lot of new opportunities that are created by AI, such as conversational, chatbot-based medical assistants, or using face recognition to build turnstiles that open automatically when an authorized person approaches. So all of our teams are pursuing those new vertical opportunities, as well.
Would any of what you do fall under the category of basic research? Do you ever do things because, say, this might be useful, but we don’t exactly know how?
We do a lot of work in basic research, and you know, it’s interesting how successful basic research starts out as basic research, but after some period of time becomes less basic, once you see the application value. So, we’ve had a lot of that. I would say that within Baidu, our early face recognition work started out as what felt like basic research, but now this service is in production, and really serving hundreds of millions of users. Our early work in neural machine translation started out as basic research. In fact, one aspect of this story that’s not widely known: neural machine translation was a technology first pioneered, developed and shipped in China. The U.S. companies developed and shipped it well after Baidu, and so I think this is one example of a place where our China teams definitely led the way. I feel that our basic research in computer vision, for example in face recognition, has also been pioneering. Today, we’re doing basic research on learning robots and machine learning broadly. I think that our research covers the whole spectrum, from very basic to very applied.
What would a team look like at Baidu? Do you generally have small teams? Are they developer-heavy? What have you found to be a successful way to allocate limited resources?
It’s a complicated question. I will say a lot of our projects start small. For example, a year ago, our autonomous driving team was 22 people. But after the team developed some traction, showed initial promise, and developed a well-thought-through business plan, that allowed us to justify pouring tremendously more resources, now maybe hundreds of people, into the team, to build what started as a basic research project into a brand new business direction. So we’ve had a lot of projects start from relatively small teams, but after they’ve shown traction and the value was clear, we wound up building them into teams of many dozens or even a few hundred people.
What do you think is a hard problem in artificial intelligence today, that we’ll solve in five years? What’s something that would be very difficult today to do, but in five years’ time, it’ll be commonly done?
From a research perspective, I think that transfer learning and multi-task learning are among the areas that I would love to figure out. Most of the economic value of machine learning today is applied learning: learning from a lot of labeled data that was labeled for the specific task you’re trying to solve, such as learning to recognize faces from a huge database of labeled faces. For a lot of tasks, we just don’t have enough data in the specific vertical that we want to build our system for. So one of the up-and-coming areas is transfer learning, where you have a machine learn a different task. For example, learn to recognize objects in general. And having learned to recognize objects in general, how much of that knowledge carries over to the specific task of recognizing faces?
I’m seeing very, very promising traction from the research perspective, and there are techniques that are now widely used for this type of transfer learning, but it feels that the theory and best practices of how to do this are still in their incredibly early stages. The reason we are excited by transfer learning is that modern deep learning has proved incredibly valuable for problems where we do have a lot of data. This has been a huge driver of value across many, many applications. But there are also a lot of problems where we just don’t have that much data. Take speech recognition. There are some languages, such as Mandarin Chinese, for which we have a ton of data. But there are also some languages, you know, spoken by small populations, and we’ll never get a huge dataset for those languages. So is it possible to take what we learn from, say, Mandarin, and transfer that knowledge, in order to do speech recognition for a Chinese dialect that is spoken by a much smaller population, and for which we therefore have a much smaller amount of data? We do have techniques to do this, and we are doing it today, but I think that advances in this research area will allow AI to tackle a much broader range of problems.
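The pretrain-then-fine-tune idea Ng describes can be shown with a deliberately tiny sketch. This is an illustrative toy (a one-parameter linear model and made-up data, not any actual Baidu system): a parameter learned on a data-rich "source" task gives a fine-tuned model a head start on a related, data-poor "target" task, compared with training from scratch on the same small budget.

```python
import random

random.seed(0)


def fit_slope(xs, ys, slope=0.0, lr=0.01, steps=500):
    """Fit the one-parameter model y = slope * x by gradient descent on MSE."""
    for _ in range(steps):
        grad = sum(2 * (slope * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        slope -= lr * grad
    return slope


# "Source" task: plenty of labeled data for y = 3.0 * x.
src_x = [random.uniform(-1, 1) for _ in range(1000)]
src_y = [3.0 * x for x in src_x]
pretrained = fit_slope(src_x, src_y)  # ends up close to 3.0

# "Target" task: only three labeled examples of the related function y = 3.2 * x.
tgt_x = [0.5, -0.3, 0.8]
tgt_y = [3.2 * x for x in tgt_x]

# Transfer: fine-tune from the pretrained parameter with a tiny step budget,
# versus training from scratch with the exact same budget.
finetuned = fit_slope(tgt_x, tgt_y, slope=pretrained, steps=20)
scratch = fit_slope(tgt_x, tgt_y, slope=0.0, steps=20)

print("fine-tuned error:", abs(finetuned - 3.2))
print("from-scratch error:", abs(scratch - 3.2))
```

With the same 20 steps on the small dataset, the fine-tuned model lands much closer to the target than the from-scratch one, which is the whole appeal of transfer when target-task data is scarce. Real systems do the analogous thing with learned feature layers rather than a single scalar.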
Does artificial intelligence tell us anything useful about human intelligence? Or, conversely, do we use cues from human intelligence to make AI work better, or is it the case that they both share the word intelligence, but they’re nothing alike at all?
The knowledge from neuroscience has been only a little bit useful for the recent developments in artificial intelligence. Realistically, despite all the decades of work in neuroscience, I think today we have almost no idea how the human brain works, and so the incredibly little we know about how the human brain works has served as loose inspiration for AI. But realistically, much more AI progress today is driven by computer science principles than by neuroscience principles. Having said that, AI has become incredibly good at automating things that people can do. For example, people are very good at recognizing speech, and AI speech recognition has made rapid progress there. People are very good at recognizing faces, and AI has made rapid progress on that, too.
It turns out that our tools for advancing a piece of AI technology tend to work better when we’re trying to automate a task that a human can do, rather than tackle a task that even humans cannot do. And there are many reasons for this, but one of them is that when we choose a task humans can do and the AI falls short, we can go in and figure out how a human does the task, which lets us improve the AI more rapidly. So when I look across the many verticals of AI, we can certainly do some things that even humans aren’t very good at. I think, today, Amazon probably recommends books to me even better than my wife can. My wife knows me fairly well. And this is because Amazon has ingested a ton of data about what books I browse and read on their website, much more than my wife could possibly watch over my shoulder exactly what I’m reading. But with a few exceptions like that, I think, by and large, we’re seeing the fastest progress when AI is trying to automate things that humans can also do.
You are always the first to be conservative in your projections about what AI can accomplish, and I assume part of the reason you do that is a memory of how excessive expectations have had catastrophic effects on the science, especially with regard to funding. Is that correct?
I tend to be very practical, and I try to be a realist. But I want to offer a slightly different perspective on that. If I set out to build a team to cure all human diseases, that would be celebrated. That sounds like a great mission to work on. But frankly, sometimes there is a cost to aiming too high, and this is an unpopular and definitely contrarian view in today’s Silicon Valley, where we like to talk about how you should aim for the moon, because even if you miss, you’ll still land among the stars. But I think that, realistically, there is a cost to aiming too high, which is that, maybe, instead of building a team to solve all of the diseases in the world, it might be more productive, you might actually do more good for the world, to aim to solve malaria. So I feel like there are some very significant changes we can make to the world using AI. I think we can transform transportation through self-driving cars, but also through logistics, with AI. I think we can completely transform healthcare with AI. There are major changes we can make in the world through AI. So a lot of my efforts go into building toward these concrete, doable things, because I think that that’s actually more productive for the world than spending all of our time building toward a science fiction which might not come to fruition for maybe even hundreds of years. And I get that this is an unpopular, contrarian view in Silicon Valley.
Having said that, as a society, we should do all sorts of things. So I think the world is a better place because we have some people working to solve malaria, and hopefully, through the work of the Gates Foundation and the World Health Organization, and others, maybe we’re getting real traction against malaria. We should also have some people working to solve every single human disease under the sun. I think it’s a good thing that society allocates its resources in a diverse set of ways. But I do think that it’s helpful, for the progress of our field, when we think through ‘what are the tasks that we are confident we can achieve?’, versus ‘what are the tasks that are further-off dreams that we should invest in?’ Parts of my teams do work on [these], but it’s only a fraction of our overall efforts.
Do you believe in the possibility of an AGI and, if so, do you believe it is something that will be achieved along the evolutionary paths of techniques that we know, with Moore’s Law at our back and all of that, or would an AGI require a fundamental breakthrough, something we can’t even anticipate?
I think achieving AGI would definitely require multiple breakthroughs. It may well happen. Both [through] software and algorithmic breakthroughs, and likely hardware breakthroughs as well. Whether that breakthrough will come ten or a hundred or a thousand years from now, though, I find hard to predict.
Do you think that human creativity, such as the ability to write a screenplay or a great novel, does that require an AGI, or is that within our grasp, with the technology we have now?
I think a lot of creativity is when we don’t understand the process by which something was created. For example, Garry Kasparov said that he saw creativity in Deep Blue’s moves. And as a technologist, I know how chess programs run. By throwing an amazing amount of computation at the task, they are able to execute chess moves that even chess masters thought were creative. Having been involved in the creative process myself, creativity is much more about hard work, and the incremental putting together of many small pieces, that builds up to a large thing that looks like it came out of nowhere. But if someone didn’t see all the small pieces, and how hard it was to assemble all the small pieces together into this creative thing, sometimes I think creativity looks more magical from the outside than it does from the inside.
My artist friends practice individual brush strokes over and over, and draw similar paintings over and over, and make incremental progress. My grandmother was a painter and made incremental progress toward an amazing work of art. When you see only the final product, rather than all of the baby steps that it took to get there, I think it feels more magical than if you were the one who had to do all the work to make all the little incremental pieces of progress.
So, reading between the lines of what you just said, human creativity is computational and attainable, on some time scale that’s reasonable. It’s not something mystical or otherworldly, beyond what we know how to do. Would you agree?
Yeah. Whether through executing the occasional brilliant chess move, finding an interpretation of a sentence that a human hadn’t thought of before, or creating a simple work of art, I think that we’re seeing machine behavior that one could say is somewhat creative. We’ll probably continue to see incremental progress, and machines gradually becoming more “creative”, over the next several years.
How is your team distributed geographically? Where are most of the people?
Mainly in Beijing. We have a team of about 100 in the United States, large teams in Beijing, and some smaller teams in Shanghai and Shenzhen.
With robotics, you can see national and regional priorities emerging. In Japan, for instance, there’s more of an emphasis on making robots friendly, things that you can emotionally connect with, than in other parts of the world. Are there things like that in artificial intelligence? You mentioned earlier face recognition coming out of China. Are there things in AI that different companies, or different regions, or different countries, look at differently?
I think face recognition is one example where economic pressures and a brilliant business model have driven a lot of progress in China. I think, at the product level, different business pressures and product priorities have caused different countries to invest more or less aggressively in different areas. Some examples from China: typing Chinese characters on a cell phone keyboard is even more painful than typing English on a cell phone keyboard. That, in turn, drove a lot of pressure for better mobile phone speech recognition. So I feel a lot of the speech recognition breakthroughs were pioneered by Baidu because of this strong product pressure to get speech recognition working for users.
Machine translation is another one. You know, there’s been a lot of PR in the U.S. on neural machine translation. What many people might not be aware of is that neural machine translation was first pioneered, developed, and shipped as a product in China. Large U.S. companies came later, and I think one of the reasons for this is that in China, there is a huge desire for translation of foreign content into Chinese, whereas in the English-speaking world, there is a lot of English content. There is a lot of Chinese content as well, but foreign content gets translated into Chinese very rapidly, just as a cultural phenomenon, whereas there is so much English content in the world that I think there’s less pressure for English speakers to have access to foreign-language content.
I think face recognition as a business is taking off rapidly in China because, being a mobile-first society, people in China are used to making significant financial transactions on their phones. For example, you can get an education loan from Baidu, and then we literally send you a lot of money, just based on the loan application you carry out on your mobile phone. And when we’re sending someone a significant sum of money through their mobile phone, we have a very strong interest in verifying the identity of that person. So face recognition has come to the forefront as a key technology for doing this. Those pressures mean that face recognition is another area that I feel is taking off very fast in China, faster than in other countries.
We have a lot of innovation in AI, in both the U.S. and in China. And there are other areas, I guess. I would say that the UK has invested heavily in AI for playing video games. I’m personally not investing in AI for playing video games, but I guess there are different interests and priorities in different organizations.
You know, I think that AI progress today is a global phenomenon, and I feel that there’s a lot of innovation happening in China that the English-speaking world is not aware of. This is not a secrecy issue; I think it’s just a lack-of-fluency issue. For example, I was at the NIPS conference just a few weeks ago, and within a day, all of the most significant talks presented at NIPS were already summarized, transcribed into Chinese, and posted online in China. So the transfer of knowledge from an English-language conference in Barcelona into the Chinese language was very fast and very efficient: within a day, many researchers in China could read, in Chinese, what was presented in English, in Spain. I think the fact that many people in China speak and read English fluently makes this possible.
I think that, unfortunately, transfer of knowledge in the opposite direction is much slower, just because, when you look globally, many researchers today outside of China do not speak Chinese. So there are many things that are invented, even widely publicized, in China that the English-speaking audience does not find out about, sometimes even for a year, until an English-speaking company invents something similar. So one of the things I hope I can do is help increase the velocity of knowledge transfer in the opposite direction as well, because if we can make the research community a more global one, then I think the whole global research community will advance faster.
And there are some specific examples. I think it was first in China that Mandarin-language speech recognition for short phrases surpassed human-level recognition, over a year ago, but somehow this result wasn’t widely known globally until more recently. I’ve seen many examples, ranging from advances in speech recognition, to advances in neural machine translation, to advances in building up GPU infrastructure for deep learning, that were first invented in China, but that I wish had made it to the U.S. sooner as well.
Are there any sites or periodicals that you would suggest to our readers, so that they can find information like this more easily?
Knowledge in China is disseminated in ways that are different than in the U.S. AI knowledge is disseminated very rapidly on social media in China, in a way that’s very intense, and is hard to understand if you haven’t experienced it yourself. And then I would say there are many websites, as well, but a lot of these are Chinese language websites. Follow me on Twitter. I’ll see what I can do.
Do you have an opinion on what human consciousness is? Or, more specifically, do you believe human consciousness is fundamentally computational?
I don’t know what consciousness is. In philosophy, there’s a debate about whether the people around you are really conscious, or whether they’re just zombies, automatons that, through computation, are acting as if they are conscious. After all, how do we know if anyone other than ourselves is truly conscious, or an automaton? I don’t see consciousness as something that computers will fundamentally never achieve, but exactly how to get there, and how many decades or centuries are needed to get there, is not clear.
Join us at Gigaom AI in San Francisco, February 15-16th where Andrew Ng will speak further on the subject of AI.