Voices in AI – Episode 2: A Conversation with Oren Etzioni

In this episode Byron and Oren talk about AGI, Aristo, the future of work, conscious machines, and Alexa.
[podcast_player name="Episode 2: A Conversation with Oren Etzioni" artist="Byron Reese" album="Voices in AI" url="https://voicesinai.s3.amazonaws.com/2017-09-28-(00-57-00)-oren-etzioni.mp3" cover_art_url="https://voicesinai.com/wp-content/uploads/2017/09/voices-headshot-card-1.jpg"]
Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today, our guest is Oren Etzioni. He’s a professor of computer science who founded and ran University of Washington’s Turing Center. And since 2013, he’s been the CEO of the Allen Institute for Artificial Intelligence. The Institute investigates problems in data mining, natural language processing, and the semantic web. And if all of that weren’t enough to keep a person busy, he’s also a venture partner at the Madrona Venture Group. Business Insider called him, quote: “The most successful entrepreneur you’ve never heard of.”
Welcome to the show, Oren.
Oren Etzioni: Thank you, and thanks for the kind introduction. I think the key emphasis there would be, “you’ve never heard of.”
Well, I’ve heard of you, and I’ve followed your work and the Allen Institute’s as well. And let’s start there. You’re doing some fascinating things. So if you would just start off by telling us a bit about the Allen Institute, and then I would love to go through the four projects that you feature prominently on the website. And just talk about each one; they’re all really interesting.
Well, thanks. I’d love to. The Allen Institute for AI is really Paul Allen’s brainchild. He’s had a passion for AI for decades, and he’s founded a series of institutes—scientific institutes—in Seattle, modeled after the Allen Institute for Brain Science, which has been running very successfully since 2003. We got started in 2013 and were launched as a nonprofit on January 1, 2014, and it’s a great honor to serve as CEO. Our mission is AI for the common good, and as you mentioned, we have four projects that I’m really excited about.
Our first project is the Aristo project, and that’s about building a computer program that’s able to answer science questions of the sort that we would ask a fourth grader, and now we’re also working with eighth-grade science. And people sometimes ask me, “Well, gosh, why do you want to do that? Are you trying to put 10-year-olds out of work?” And the answer is, of course not.
We really want to use that test—science test questions—as a benchmark for how well are we doing in intelligence, right? We see tremendous success in computer programs like AlphaGo, beating the world champion in Go. And we say, “Well, how does that translate to language—and particularly to understanding language—and understanding diagrams, understanding science?”
And one way to answer that question is to, kind of, level the playing field with, “Let’s ask machines and people the same questions.” And so we started with these science tests, and we can see that, in fact, people do much better. It turns out, paradoxically, that things that are relatively easy for people are really quite hard for machines, and things that are hard for people—like playing Go at world championship level—those are actually relatively easy for the machine.
Hold on there a minute: I want to take a moment and really dissect this. Any time there’s a candidate chatbot that can make a go at the Turing test, I have a standard question that I start with, and none of them have ever answered it correctly.
It’s a question a four-year-old could answer, which is, “Which is bigger, a nickel or the sun?” So why is that a hard problem? And would what you’re doing be able to answer it? And why would you start with a fourth grader instead of a four-year-old, really going back to the most basic, basic questions? So the first part of that is: Would what you’re doing be able to answer the question?
Certainly our goal is to give it the background knowledge and understanding ability to be able to answer those types of questions, which combine both basic knowledge, basic reasoning, and enough understanding of language to know that, when you say “a nickel,” you’re not referring to the metal, but you’re referring to a particular coin, with a particular size, and so on.
The reason that’s so hard for the machine is that it’s part of what’s called ‘common sense’ knowledge, right? Of course, the machine, if you programmed it, could answer that particular question—but that’s a stand-in for literally billions of other questions that you could ask about relative sizes, about animal behavior, about the properties of paper versus feathers versus furniture.
There’s really a seemingly infinite—or certainly a very, very large number—of basic questions that people, that certainly eight-year-olds can answer, or four-year-olds, but that machines struggle with. And they struggle with it because, what’s their basis for answering the questions? How would they acquire all that knowledge?
Now, to say, “Well, gosh, why don’t we build a four-year-old, or maybe even a one-year-old?” I’ve actually thought about that. So at the university, we investigated for a summer, trying to follow the developmental ladder, saying: “Let’s start with a six-month-old, and a one-year-old, etc., etc.”
And my interest, in particular, is in language. So I said, “Well, gosh, surely we can build something that can say ‘dada’ or ‘mama’, right?” And then work our way from there. What we found is that, even a very young child, their ability to process language and understand the world around them is so involved with their body—with their gaze, with their understanding of people’s facial expressions—that the net effect was that we could not build a one-year-old.
So, in a funny way, once you’re getting to the level of a fourth grader, who’s reading and answering multiple choice science questions, it gets easier and it gets more focused on language and semantics, and less on having a body, being able to crawl—which, of course, are challenging robotics problems.
So, we chose to start higher up in the ladder, and it was kind of a Goldilocks thing, right? It was more language-focused and, in a funny way, easier than doing a one-year-old, or a four-year-old. And—at the same time—not as hard as, say, college-level biology questions or AP questions, which involve very complicated language and reasoning.
So is your thinking that, by focusing on school science examinations in particular, you have a really, really narrow vocabulary to master and a really narrow set of objects whose properties you have to understand? Is that the idea? Like, AI does well at games because they’re constrained worlds with fixed rules. Are you trying to build an analog to that?
It is an analog, right? In the sense that AI has done well with narrow tasks and, you know, limited domains. At the same time, “narrow” is probably not the right word here. There is—and this is something that we’ve learned—tremendous variety in these questions: not only variety in ways of saying things, but also variety because these tests often require you to take something that you have an understanding of—like gravity or photosynthesis—and then apply it to a particular situation.
“What happens if we take a plant and move it nearer to the window?” That combination of basic scientific knowledge with an application to a real-world situation means that it’s really quite varied. And it’s really a much harder AI problem to answer fourth-grade science questions than it is to solve Go.
I completely get that. I’m going to ask you a question, and it’s going to sound like I’m changing the topic, but it is germane. Do you believe that we’re on a path to building an AGI—a general intelligence? You’re going to learn things doing this, and is it, like, all we will need to do is scale them up more and more, faster, faster, better and better, and you’ll have an AGI? Is this on that trajectory, or is an AGI something completely unrelated to what you’re trying to do here?
That’s a very, very key question. And I would say that we are not on a path to building an AGI—in the sense that, if you build Aristo, and then you scale it to twelfth grade, and more complex vocabulary, and more complex reasoning, and, “Hey, if we just keep scaling this further, we’ll end up with artificial general intelligence, with an AGI.” I don’t think that’s the case.
I think there are many other problems that we have to solve, and this is a part of a very complex picture. And if it’s a path, it’s a very meandering one. But really, the point is that the word “research,” which is obviously what we’re doing here, has the word “search” in it. And that means that we’re iterating, we’re going here, we’re going there, we’re looking, you know.
“Oh, where did I put my keys?” Right? How many times do you retrace your steps and open that drawer, and say, “Oh, but I forgot to look under the socks,” or “I forgot to look under the bed”? It’s this very complex, uncertain process; it’s quite the opposite of, “Oh, I’m going down the path, the goal is clear, and I just have to go uphill for five miles, and I’ll get there.”
I’ve got a book on AI coming out towards the end of this year, and in it, I talk about the Turing test. And I talk about, like, the hardest question I can think of to ask a computer so that I could detect if it’s a computer or a person. And here’s a variant of what I came up with, which is:
“Doctor Smith is eating at his favorite restaurant, that he eats at frequently. He gets a call, an emergency call, and he runs out without paying his bill. Are the owners likely to prosecute?” So, if you think about that… Wow, you’ve got to know he’s a doctor, the call he got is probably a medical emergency, you have to infer that he eats there a lot, that they know who he is, they might even know he’s a doctor. Are they going to prosecute? So, it’s a gazillion social things that you have to know in order to answer that question.
Now, is that also on the same trajectory as solving twelfth grade science problems? Or is that question that I posed, would that require an AGI to answer?
Well, one of the things that we’ve learned is that, whenever you define a task—say answering story types of questions that involve social nuance, and maybe would involve ethical and practical considerations—that is on the trajectory of our research. You can imagine Aristo, over time, being challenged by these more nuanced questions.
But, again, we’ve gotten so good at identifying those tasks, building training sets, and building models to answer those questions. And a program might get good at answering those questions but still have a hard time crossing the street. Still have a hard time reading a poem or telling a joke.
So, the key to AGI is the “G”; the generality is surprisingly elusive. And that’s the amazing thing, because that four-year-old that we were talking about has generality in spades, even though she’s not necessarily a great chess player or a great Go player. So that’s what we learned.
As our AI technology evolves, we keep learning about what is the most elusive aspect of AI. At first, if you read some of the stuff that was written in the ’60s and the ’70s, people were very skeptical that a program could ever play chess, because chess was really seen as something only very intelligent people are good at.
And then, that became solved, and people talked about learning. They said, “Well, gosh, but programs can’t learn.” And as we’ve gotten better, at least at certain kinds of learning, now the emphasis is on generality, right? How do we build a general program, given that all of our successes, whether it’s poker or chess or certain kinds of question answering, have been on very narrow tasks?
So, one sentence I read about Aristo says, “The focus of the project is explained by the guiding philosophy that artificial intelligence is about having a mental model for how things operate, and refining that mental model based on new knowledge.” Can you break that down for us? What do you mean?
Well, I think, again, lots of things. But I think a key thing not to forget—and it goes from your favorite question about a nickel and the sun—is that so much of what we do makes use of background knowledge, just extensive knowledge of facts, of words, of all kinds of social nuances, etc., etc.
And the hottest thing going is deep learning methods. Deep learning methods are responsible for the success in Go, but the thing to remember is that often, at least by any classical definition, those programs are very knowledge-poor. If you could talk to them and ask them, “What do you know?” you’d find out that—while they may have stored a lot of implicit information, say, about the game of Go—they don’t know a whole heck of a lot. And that, of course, touches on the topic of consciousness, which I understand is also covered in your book. If I asked AlphaGo, “Hey, did you know you won?” AlphaGo can’t answer that question. And it’s not because it doesn’t understand natural language. It’s not conscious.
Kasparov said that about Deep Blue. He said, “Well, at least it can’t gloat. At least it doesn’t know that it beat me.” To that point, Claude Shannon wrote about computers playing chess back in the ’50s, but it was an enormous amount of work. It took the best minds a long time to build something that could beat Kasparov. Do you think that something like that is generalizable to a lot of other things? Or am I hearing you correctly that that is not a step towards anything general? That’s a whole different kind of thing, and therefore Aristo is, kind of, doing something very different than AlphaGo or chess, or Jeopardy?
I do think that we can generalize from that experience. But I think that generalization isn’t always the one that people make. So what we can generalize is that, when we have a very clear “objective function” or “performance criteria”—basically it’s very clear who won and who lost—and we have a lot of data, that as computer scientists we’re very, very good—and it still, as you mentioned, took decades—but we’re very, very good at continuing to chip away at that with faster computers, more data, more sophisticated algorithms, and ultimately solving the problem.
However, in the case of natural language: If you and I, let’s say we’re having a conversation here on this podcast—who won that conversation? Let’s say I want to do a better job if you ever invite me for another podcast. How do I do that? And if my method for getting better involves looking at literally millions of training examples, you’re not going to do millions of podcasts. Right?
So you’re right, that a very different thing needs to happen when things are vaguer, or more uncertain, or more nuanced, when there’s less training data, etc., etc.—all these characteristics that make Aristo and some of our other projects very, very different than chess or Go.
So, where is Aristo? Give me a question it can answer and a question it can’t. Or is that even a cogent question? Where are you with it?
First of all, we keep track of our scores, so I can give you an example in a second. When we look at what we call “non-diagram multiple choice”—questions that are purely in language, because diagrams can be challenging for the machine to interpret—we’ve been able to reach very close to eighty percent correctness. Eighty percent accuracy on non-diagram multiple choice questions for fourth grade.
When you look at all the questions, we’re at sixty percent. Which is either great or not, because when we started—across all these questions, including ones with diagrams and what are called “direct answer questions,” where you have to answer with a phrase or a sentence rather than choosing among four options—we were close to twenty percent. We were far lower.
So, we’ve made a lot of progress, so that’s on the glass-half-full side. And the glass-half-empty side, we’re still getting a D on a fourth-grade science test. So it’s all a question of how you look at it. Now, when you ask, “What questions can we solve?” We actually have a demo on our website, on AllenAI.org, that illustrates some of these.
If I go to the Aristo project there, and I click on “live demo,” I see questions like, “What is the main source of energy for the water cycle?” Or even, “The diagram below shows a food chain. If the wheat plants died, the population of mice would likely _______?” So, these are fairly complex questions, right?
But they’re not paragraph-long, and the thing that we’re still struggling with is what we call “brittleness.” If you take any one of these questions that we can answer, and then change the way you ask the question a bit, all of a sudden we fail. This is, by the way, a characteristic of many AI systems, this notion of brittleness—where a small change that a human would dismiss with “Oh, that’s no different at all” can make a big difference to the machine.
It’s true. I’ve been playing around with an Amazon Alexa, and I noticed that if I say, “How many countries are there?” it gives me one number. If I say, “How many countries are there in the world?” it gives me a different number. Even though a human would see that as the same question. Is that the sort of thing you’re talking about?
That’s exactly the sort of thing I’m talking about, and it’s very frustrating. And, by the way, Alexa and Siri, for the people who want to take the pulse of AI—I mean, again, we’re one of the largest nonprofit AI research institutes in the world, but we’re still pretty small at 72 people—Alexa and Siri come from for-profit companies with thousands of people working on them, and it’s still the case that you can’t carry on a halfway decent dialogue with these programs.
And I’m not talking about the cutesy answers about, you know, “Siri, what are you doing tonight?” Or, “Are you better than Alexa?” I’m talking about, let’s say, the kind of dialogue you’d have with a concierge of a hotel, to help you find a good restaurant downtown. And, again, it’s because how do you score dialogues? Right? Who won the dialogue? All those questions, that are very easy to solve in games, are not even really well-posed in the context of a dialogue.
I penned an article about how—and I have to whisper her name, otherwise it will start talking to me—Alexa and Google Assistant give you different answers to factual questions.
So if you ask, “How many seconds are there in a year?” they give you different answers. And if you say, “Who designed the American flag?” they’ll give you different answers. Seconds in a year, you would think that’s an objective question, there’s a right and a wrong answer, but actually one gives you a calendar year, and one gives you a solar year, which is a quarter-day different.
And with the American flag, one says Betsy Ross, and the other one says the person who designed the 50-star configuration of the flag, which is our current flag. And in the end, both times those were the questioner’s fault, because the question itself is inherently vague, right? And so, even if the system is good, if the questions are poorly phrased, it still breaks, right? It’s still brittle.
I would say that it’s the computer’s fault. In other words, again, an aspect of intelligence is being able to answer vague questions and being able to explain yourself. But these systems, even if their fact store is enormous—and one day, they’ll certainly exceed ours—if all it can do when you say, “Well, why did you give me this number?” is say, “Well, I found it here,” then really it’s a big lookup table.
It’s not able to deal with the vagueness, or to explain itself in a more meaningful way. What if you put the number three in that table? You ask, “How many seconds are there in a year?” The program would happily say, “Three.” And you say, “Does that really make sense?” And it would say, “Oh, I can’t answer that question.” Right? Whereas a person would say, “Wait a minute. It can’t be three seconds in a year. That just doesn’t make sense!” Right? So, we have such a long way to go.
Right. Well, let’s talk about that. You’re undoubtedly familiar with John Searle’s Chinese Room question, and I’ll set it up for the listener—because what I’m going to ask you is, is it possible for a computer to ever understand anything?
The setup, very briefly—I mean, I encourage people to look it up—is that there’s a person in a room and he doesn’t speak any Chinese, and he’s given questions in Chinese, and he’s got all these books he can look them up in, but he just copies characters down and hands them back. And he doesn’t know if he’s talking about cholera or coffee beans or what have you. And the analogy is, obviously, that’s what a computer does. So can a computer actually understand anything?
You know, the Chinese Room thought experiment is really one of the most tantalizing and fun thought experiments in philosophy of mind. And so many articles have been written about it, arguing this, that or the other thing. In short, I think it does expose some of the issues, and the bottom line is when you look under the hood at this Chinese Room and the system there, you say, “Gosh, it sure seems like it doesn’t understand anything.”
And when you take a computer apart, you say, “Gosh, how could it understand? It’s just a bunch of circuits and wires and chips.” The only problem with that line of reasoning is, it turns out that if you look under the hood in a person’s mind—in other words, if you look at their brain—you see the same thing. You see neurons and ion potentials and chemical processes and neurotransmitters and hormones.
And when you look at it at that level, surely, neurons can’t understand anything either. I think, again, without getting to a whole other podcast on the Chinese Room, I think that it’s a fascinating thing to think about, but it’s a little bit misleading. Understanding is something that emerges from a complex technical system. That technical system could be built on top of neurons, or it could be built on top of circuits and chips. It’s an emergent phenomenon.
Well, then I would ask you, is it strong emergence or is it weak emergence? But, we’ve got three more projects to discuss. Let’s talk about Euclid.
Euclid is, really, a sibling of Aristo, and in Euclid we’re looking at SAT math problems. The Euclid problems are easier in the sense that you don’t need all this background knowledge to answer these pure math questions. You surely need a lot less of that. However, you really need to very fully and comprehensively understand the sentence. So, I’ll give you my favorite example.
This is a question that is based on a story about Ramanujan, the Indian number theorist. He said, “What’s the smallest number that’s the sum of two cubes in two different ways?” And the answer to that question is a particular number, which the listeners can look up on Google. But, to answer that correctly, you really have to fully parse that rather long and complicated sentence and understand “the sum of two cubes in two different ways.” What on earth does that mean?
And so, Euclid is working to have a full understanding of sentences and paragraphs, which are the kind of questions that we have on the SATs. Whereas often with Aristo—and certainly, you know, with things like Watson and Jeopardy—you could get away with a much more approximate understanding, “this question is sort of about this.” There’s no “sort of” when you’re dealing with math questions, and you have to give the answer.
And so that is, as you say, a sibling to Aristo; but Plato, the third one we’re going to discuss, is something very different, right?
Right. Maybe if we’re using this family metaphor, Plato is Aristo’s and Euclid’s cousin, and what’s going on there is we don’t have a natural benchmark test, but we’re very, very interested in vision. We’ve realized that a lot of the questions that we want to address, a lot of the knowledge that is present in the world isn’t expressed in text, certainly not in any convenient way.
One great way to learn about the sizes of things—not just the sun and a nickel, but maybe even a giraffe and a butterfly—is through pictures. You’re not going to find the sentence that says, “A giraffe is much bigger than a butterfly,” but if you see pictures of them, you can make that connection. Plato is about extracting knowledge from images, from videos, from diagrams, and being able to reason over that to draw conclusions.
So, Ali Farhadi, who leads that project and who shares his time between us and the Allen School at University of Washington, has done an amazing job generating result after result, where we’re able to do remarkable things based on images.
My favorite example of this—you kind of have to visualize it—imagine drawing a diagonal line and then a ball on top of that line. What’s going to happen to that ball? Well, if you can visualize it, of course the ball’s going to roll down the line—it’s going to roll downhill.
It turns out that most algorithms are actually really challenged to make that kind of prediction, because to make that kind of prediction, you have to actually reason about what’s going on. It’s not just enough to say, “There’s a ball here on a line,” but you have to understand that this is a slope, and that gravity is going to come into play, and predict what’s going to happen. So, we really have some of the state-of-the-art capabilities, in terms of reasoning over images and making predictions.
Isn’t video a whole different thing, because you’re really looking at the differences between images, or is it the same basic technology?
At a technical level, there are many differences. But actually, the elegant thing about video is that, as you intimated, a video is just a sequence of images. It’s really our eye, or our mind, that constructs the continuous motion; all it is, is a number of images shown per second. Well, for us, it’s a wonderful source of training data, because I can take the image at Second 1 and make a prediction about what’s going to happen in Second 2. And then I can look at what happened at Second 2, and see whether the prediction was correct or not. Did the ball roll down the hill? Did the butterfly land on the giraffe? So there’s a lot of commonalities, and video is actually a very rich source of images and training data.
One of the challenges with images is—well, let me give an example, then we can discuss it. Say I lived on a cul-de-sac, and the couple across the street were expecting—the woman is nine months pregnant—and one time I get up at three in the morning and I look out the window and their car is gone. I would say, “Aha, they must have gone to the hospital.” In other words, I’m reasoning from what’s not in the image. That would be really hard, wouldn’t it?
Yes. You’re way ahead of Plato. It’s very, very true.
And that anticipates Semantic Scholar; I want to make sure that we get to that.
With Semantic Scholar, a number of the capabilities that we see in these other projects come together. Semantic Scholar is a scientific search engine. It’s available 24/7 at semanticscholar.org, and it allows people to look for computer science papers, for neuroscience papers. Soon we’re going to be launching the ability to cover all the papers in biomedicine that are available on engines like PubMed.
And what we’re trying to do there is deal with the fact that there are so many, you know, over a hundred million scientific research papers, and more are coming out every day, and it’s virtually impossible for anybody to keep up. Our nickname for Semantic Scholar sometimes is Da Vinci, because we say Da Vinci was the last Renaissance man, right?
The person who, kind of, knew all of science. There are no Renaissance men or women anymore, because we just can’t keep up. And that’s a great place for AI to help us, to make scientists more efficient in their literature searches, more efficient in their abilities to generate hypotheses and design experiments.
That’s what we’re trying to do with Semantic Scholar, and that involves understanding language, and that involves understanding images and diagrams, and it involves a lot more.
Why do you think the semantic web hasn’t taken off more, and what is your prediction about the semantic web?
I think it’s important to distinguish between “semantics,” as we use it at Semantic Scholar, and “semantics” in the semantic web. In Semantic Scholar, we try to associate semantic information with text. For example, this paper is about a particular brain region, or this paper uses fMRI methodology, etc. These are pretty simple semantic distinctions.
The semantic web involved a very rich notion of semantics that, frankly, is superhuman and way, way, way beyond what we can do in a distributed world. So that vision by Tim Berners-Lee really evolved over the years into something called “linked open data,” where, again, the semantics is very simple and the emphasis is much more on different players on the web linking their data together.
I think that very, very few people are working on the original notion of the semantic web, because it’s just way too hard.
I’m just curious, this is a somewhat frivolous question: But the names of your projects don’t seem to follow an overarching naming scheme. Is that because they were created and named elsewhere or what?
Well, it’s because, you know, if you put a computer scientist, which is me, in charge of branding, you’re going to run into problems. So, I think, Aristo and Euclid are what we started with, and those were roughly analogous. Then we added Plato, which is an imperfect name, but still roughly in the mythological world. And then Semantic Scholar really is a play off of Google Scholar.
So Semantic Scholar is, if you will, really the odd duck here. And when we were considering a project on dialogue—which we still are—we called that project Socrates. But then I’m also thinking, “Do we really want all the projects to be named after men?” which is definitely not our intent. So, I think the bottom line is it’s an imperfect naming scheme and it’s all my fault.
So, the mission of the Allen Institute for AI is, quote: “Our mission is to contribute to humanity through high-impact AI research and engineering.” Talk to me about the “contribute to humanity” part of that. What do you envision? What do you hope comes of all of this?
Sure. So, I think that when we started, we realized that so often AI is vilified—particularly in Hollywood films, but also by folks like Stephen Hawking and Elon Musk—and we wanted to emphasize AI for the common good, AI for humanity, where we saw some real benefits to it.
And also, in a lot of for-profit companies, AI is used to target advertising, or to get you to buy more things, or to violate your privacy, if it’s being used by intelligence agencies or by aggressive marketing. And we really wanted to find places like Semantic Scholar, where AI can help solve some of humanity’s thorniest problems by helping scientists.
And so, that’s where it comes from; it’s a contrast to these other, either more negative uses, or more negative views of AI. And we’ve been really pleased that, since we were founded, organizations like OpenAI or the Partnership on AI, which is an industry consortium, have adopted missions that are very consistent and kind of echo ours, you know: AI to benefit humanity and society and things like that. So it seems like more and more of us in the field are really focused on using AI for good.
You mentioned fear of AI, and the fear manifests—and you can understand Hollywood, I mean, it’s drama, right—but the fear manifests in two different ways. One is what you alluded to, that it’s somehow bad, you know, Terminator or what have you. But the other one that is on everybody’s mind is, what do you think about AI’s effect on employment and jobs?
I think that’s a very serious concern. As you can tell, I’m not a big fan of the doomsday scenarios about AI. I tell people we should not confuse science with science fiction. But another reason why we shouldn’t concern ourselves with Skynet and doomsday scenarios is because we have a lot more realistic and pressing problems to worry about. And that, for example, is AI’s impact on jobs. That’s a very real concern.
We’ll see it in the transportation sector, I predict, particularly soon, where truck drivers and Uber drivers and so on are going to be gradually squeezed out of the market, and that’s a very significant number of workers. And it’s a challenge, of course, to help these people retrain and find other jobs in an increasingly digital economy.
But, you know, in the history of the United States, at least over the past couple of hundred years, there have been a number of really disruptive technologies that have come along—the electrification of industry, the mechanization of industry, the replacement of animal power with steam—things that really hit quickly, and yet unemployment never once budged because of that. Because what happens is, people just use the new technology. And isn’t it at least possible that, as we move along with the development of artificial intelligence, it actually is an empowering technology that people can use to increase their own productivity? Like, anybody could use it to increase their productivity.
I do think that AI will have that role, and I do think that, as you intimated, these technological forces have some real positives. So, the reason that we have phones and cars and washing machines and modern medicine, all these things that make our lives better and that are broadly shared through society, is because of technological advances. So I don’t think of these technological advances, including AI advances, as either a) negative; or b) avoidable.
If we say, “Okay, we’re not going to have AI,” or “We’re not going to have computers,” well, other countries will and they’ll overtake us. I think that it’s very, very difficult, if not impossible to stop broad-based technology change. Narrow technologies that are particularly terrible, like landmines or biological weapons, we’ve been able to stop. But I think AI isn’t stoppable because it’s much broader, and it’s not something that should be stopped, it’s not like that.
So I very much agree with what you said, but with one key caveat. We survived those things and we emerged thriving, but the disruption, over significant periods of time and for millions of people, was very, very difficult. As we went from a society that was ninety-something percent agricultural to one where only two percent of workers are in agriculture, people suffered and people were unemployed. And so, I do think that we need to have the programs in place to help people with these transitions.
And I don’t think that they’re simple because some people say, “Sure, those old jobs went away, but look at all these great jobs. You know, web developer, computer programmer, somebody who leverages these technologies to make themselves more effective at their jobs.” That’s true, but the reality is a lot more complicated. Are all these truck drivers really going to become web developers?
Well, I don’t think that’s the argument, right? The argument is that everybody moves one small notch up. So somebody who was a math teacher in a college, maybe becomes a web developer, and a high school teacher becomes the college teacher, and then a substitute teacher gets the full time job.
Nobody says, “Oh, no, no, we’re going to take these people, you know, who have less training and we’re going to put them in these highly technical jobs.” That’s not what happened in the past either, right? The question is can everybody do a job a little more complicated than the one they have today? And if the answer to that is yes, then do we have a big disruption coming?
Well, first of all, you’re making a fair point. I was oversimplifying by mapping the truck drivers to the developers. But, at the same time, I think we need to remember that these changes are very disruptive. And, so, the easiest example to give, because it’s fresh in my mind and, I think, other people’s minds—let’s look at Detroit. This isn’t technological change; it’s more due to globalization and to the shifting of manufacturing jobs out of the US.
But nevertheless, these people didn’t just each take a little step up or a little step to the right, whatever you want to say. These people and their families suffered tremendously. And it’s had very significant ramifications, including Detroit going bankrupt, including many people losing their health care, including the vote for President Trump. So I think if you think on a twenty-year time scale, will the negative changes be offset by positive changes? Yes, to a large extent. But if you think on shorter time scales, and you think about particular populations, I don’t think we can just say, “Hey, it’s going to all be alright.” I think we have a lot of work to do.
Well, I’m with you there, and if there’s anything that I think we can take comfort in, it’s that the country did that before. There used to be a debate in the country about whether post-literacy education was worth it. This was back when we were an agricultural society. And you can understand the logic, right? “Well once somebody learns to read, why do you need to keep them in school?” And then, people said, “Well, the jobs of the future are going to need a lot more skills.” That’s why the United States became the first country in the world to guarantee a high school education to every single person.
And it sounds like you’re saying something like that, where we need to make sure that our education opportunities stay in sync with the requirements of the jobs we’re creating.
Absolutely. I think we are agreeing that there’s a tremendous potential for this to be positive, you know? Some people, again, have a doomsday scenario for jobs and society. And I agree with you a hundred percent; I don’t buy into that. And it sounds like we also agree, though, that there are things that we could do to make these transitions smoother and easier on large segments of society.
And it definitely has to do with improving education and finding opportunities etc., etc. So, I think it’s really a question of how painful will this change be, and how long will it take until we’re at a new equilibrium that, by the way, could be a fantastic one? Because, you know, the interesting thing about the truck jobs, and the toll jobs that went away, and a lot of other jobs that went away; some of these jobs are awful. They’re terrible, right? People aren’t excited about a lot of these jobs. They do them because they don’t have something better. If we can offer them something better, then the world will be a better place.
Absolutely. So we’ve talked about AGI. I assume you think that we’ll eventually build a general intelligence.
I do think so. I think it will easily take more than twenty-five years, and it could take as long as a thousand years, but I’m what’s called a materialist, which doesn’t mean that I like to shop on Amazon; it means that I believe that when you get down to it, we’re constructed out of atoms and molecules, and there’s nothing magical about intelligence. Sorry—there’s something tremendously magical about it, but there’s nothing ineffable about it. And, so, I think that, ultimately, we will build computer programs that can do and exceed what we can do.
So, by extension, you believe that we’ll build conscious machines as well?
Yes. I think consciousness emerges from it. I don’t think there’s anything uniquely human or biological about consciousness.
The range of time that people think it will be before we create an AGI, in my personal conversations, runs from five to five hundred years. Where in that spectrum would you cast your ballot?
Well, I would give anyone a thousand-to-one odds that it won’t happen in the next five years. I’ll bet ten dollars against ten thousand dollars, because I’m in the trenches working on these problems right now and we are just so, so far from anything remotely resembling an AGI. And I don’t know anybody in the field who would say or think otherwise.
I know there are some, you know, so-called futurists or what have you… But people actively working on AI don’t see that. And furthermore, even if somebody says some random thing, then I would ask them, “Back it up with data.” What’s your basis for saying that? Look at our progress rates on specific benchmarks and challenges; they’re very promising but they’re very promising for a very narrow task, like object detection or speech recognition or language understanding etc., etc.
Now, when you go beyond ten, twenty, thirty years, who can predict what will happen? So I’m very comfortable saying it won’t happen in the next twenty-five years, and I think that it is extremely difficult to predict beyond that, whether it’s fifty or a hundred or more, I couldn’t tell you.
So, do you think we have all the parts we need to build an AGI? Is it going to take some breakthrough that we can’t even fathom right now? Or with enough deep learning and faster processors and better algorithms and more data, could you say we are on a path to it now? Or is your sole reason for believing we’re going to build an AGI that you’re a materialist—you know, we’re made of atoms, we can build something made of atoms.
I think it’s going to require multiple breakthroughs which are very difficult to imagine today. And let me give you a pretty concrete example of that.
We want to take the information that’s in text and images and videos and all that, and represent that internally using a representation language that captures the meaning, the gist of it, like a listener to this podcast has kind of a gist of what we’ve talked about. We don’t even know what that language looks like. We have various representational languages, none of them are equal to the task.
Let me give you another way to think about it as a thought experiment. Let’s suppose I was able to give you a computer, a computer that was as fast as I wanted, with as much memory as I wanted. Using that unbelievable computer, would I now be able to construct an artificial intelligence that’s human-level? The answer is, “No.” And it’s not about me. None of us can.
So, if it were really just about the speed and so on, then I would be a lot more optimistic about doing it in the short term, because we’re so good at making things run two times faster, ten times faster, building a faster computer, storing more information. We used to store it on floppy disk, and now we store it here. Next we’re going to be storing it in DNA. This exponential march of technology under Moore’s Law—things keep getting faster and cheaper—is, in that sense, phenomenal. But that’s not enough to achieve AGI.
Earlier you said that you tell people not to confuse science with science fiction. But, about science fiction, is there anything you’ve seen, read, or watched that you actually think is a realistic scenario of what we may be able to do, what the future may hold? Is there anything that you look at and say, well, it’s fiction, but it’s possible?
You know, one of my favorite pieces of fiction is the book Snow Crash, where it, kind of, sketches this future of Facebook and the future of our society and so on. If I were to recommend one book, it would be that. I think a lot of the books about AI are long on science fiction and short on what you call “hard science fiction”; short on reality.
And if we’re talking about science fiction, I’d love to end with a note where, you know, there’s this famous Arthur C. Clarke quote, “Any sufficiently advanced technology is indistinguishable from magic.” So, I think, to a lot of people AI seems like magic, right? We can beat the world champion in Go—and my message to people, again, as somebody who works in the field day in and day out, it couldn’t be further from magic.
It’s blood, sweat, and tears—and, by the way, human blood, sweat, and tears—of really talented people, to achieve the limited successes that we’ve had in AI. And AlphaGo, by the way, is the ultimate illustration of that. Because it’s not that AlphaGo defeated Lee Sedol, or the machine defeated the human. It’s this remarkably talented team of engineers and scientists at Google DeepMind, working for years; they’re the ones who defeated Lee Sedol, with some help from technology.
Alright. Well, that’s a great place to leave it, and I want to thank you so much. It’s been fascinating.
It’s a real pleasure for me, and I look forward both to listening to this podcast, to your other ones, and to reading your book.
Thank you.
Byron explores issues around artificial intelligence and conscious computers in his upcoming book The Fourth Age, to be published in April by Atria, an imprint of Simon & Schuster. Pre-order a copy here

Voices in AI – Episode 1: A Conversation with Yoshua Bengio

In this episode Byron and Yoshua talk about knowledge, unsupervised learning, how the brain learns, creativity, and machine translation.
[podcast_player name=”Episode 1: A Conversation with Yoshua Bengio” artist=”Byron Reese” album=”Voices in AI” url=”https://voicesinai.s3.amazonaws.com/2017-09-28-(00-56-04)-yoshua-bengio.mp3″ cover_art_url=”https://voicesinai.com/wp-content/uploads/2017/09/voices-headshot-card.jpg”]


Yoshua Bengio received a PhD in Computer Science from McGill University in Canada in 1991. After two post-doctoral years, one at MIT and one at AT&T Bell Labs, he became a professor in the Department of Computer Science and Operations Research at the University of Montreal. He is the author of two books and more than 200 publications, the most cited being in the areas of deep learning, recurrent neural networks, probabilistic learning algorithms, natural language processing, and manifold learning. He is among the most cited Canadian computer scientists and is or has been Associate Editor of the top journals in machine learning and neural networks.


Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today our guest is Yoshua Bengio. Yoshua Bengio received a PhD in Computer Science from McGill University in Canada in 1991. After two post-doctoral years, one at MIT and one at AT&T Bell Labs, he became a professor in the Department of Computer Science and Operations Research at the University of Montreal. He is the author of two books and more than two hundred publications, the most cited being in the areas of deep learning, recurrent neural networks, probabilistic learning algorithms, natural language processing, and manifold learning. He is among the most cited Canadian computer scientists and is or has been Associate Editor of the top journals in machine learning and neural networks. Welcome to the show, Yoshua.
Yoshua Bengio: Thank you.
So, let’s begin. When people ask you, “What is artificial intelligence,” how do you answer that?
Artificial intelligence is about building machines that are intelligent, that can do things that humans can do; to do that, they need to have knowledge about the world and then be able to use that knowledge to do useful things.
And it’s kind of kicking the can down the street just a little bit, because there’s unfortunately no consensus definition of what intelligence is either, but it sounds like the way you describe it, it’s just kind of like doing complicated things. So, it doesn’t have an aspect of, you know, it has to respond to its environment or anything like that?
Not necessarily. You could imagine having a very, very intelligent search engine that understands all kinds of things but doesn’t really have a body, doesn’t really live in an environment other than the interactions with people through the queries. So, the kinds of intelligence, of course, that we know and we think about when we think about animals are involving movement and actions and so on. But, yeah. Intelligence could be of different forms and it could be about different aspects. So, a mouse is very intelligent in its environment. If you or I went into the head of the mouse and tried to control the muscles of the mouse and survive, we probably wouldn’t last very long. And if you understand my definition, which is about knowledge, you could know a lot of things in some areas so you could be very intelligent in some area and know very little about another area and so not be very intelligent in other areas.
And how would you describe the state of the art? Where are we with artificial intelligence?
Well, we’ve made huge progress in the ability of computers to perceive better, to understand images, sounds, and even to some extent language. But we’re still very far from machines that can discover autonomously how the world around us works. We’re still very far from machines that can understand the sort of high-level concepts that we typically manipulate with language. So, there’s a lot to do.
And, yeah, it’s true. Like, if you go to any of the bots that have been running for years, the ones that people built to maybe try to pass the Turing test or something, if you just start off by asking the question, “What’s bigger, a nickel or the sun”, I have yet to ever find one that can answer that question. Why do you think that is? What’s going on there? Why is that a hard question?
Because it’s what people call common sense. And really it refers to a general understanding of how the world around us works, at least from the point of view of humans. All of us have this kind of common sense knowledge. It’s not things that you find in books, typically. At least not explicitly. I mean, you might find some of it implicitly, but you don’t get like in Wikipedia, you don’t get that knowledge typically. That’s knowledge we pick up as children and that’s the kind of knowledge also that sometimes is intuitive. Meaning that we know how to use it and we can recognize a thing, like we can recognize a chair, but we can’t really describe formally—in other words with a few equations—what a chair is. We think we can, but when we’re actually pressed to do it, we’re not able to do a good job at that. And the same thing is true for example in the case of the AlphaGo system that beat the world champion at the game of Go, it can use a form of intuition to look at the game state and decide, “What would be a good move next?” And humans can do the same thing without necessarily being able to decompose that into a very clear and crisp explanation. Like, the way it was done for chess. In the case of chess, we have a program that actually tries many moves. Like, “If I do this, then what’s the worst thing for me that could happen? The other guy does that, and then I could do this, and then the other guy does that, and then I could do this.” So, this is a very crisp, logical explanation of why the computer does this.
In the case of neural nets, well, we have a huge network of these artificial neurons with millions or hundreds of millions of parameters, of numbers that are combined. And, of course, you can write an equation for this, but the equation will have so many numbers in it that it’s not humanly possible to really have a total explanation of why it’s doing this. And the same thing with us. If you ask a human, “Why did you take that decision,” they might come up with a story, but for the most part, there are many aspects of it that they can’t really explain. And that is why the whole classical AI program based on expert systems, where humans would download their knowledge into the computer by writing down what they know, failed. It failed because a lot of the things we know are intuitive. So, the only solution we found is that computers are going to learn that intuitive knowledge by observing how humans do it, or by observing the world, and that’s what machine learning is about.
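The crisp game-tree reasoning described above for chess—try each of my moves, assume the other guy answers with the move that is worst for me, and pick the move with the best guaranteed outcome—is the classic minimax search. A minimal sketch, using a purely illustrative toy game (states are numbers, moves add or subtract, and the score is just the state value):

```python
def minimax(state, depth, maximizing):
    """Best achievable score from `state`, looking `depth` moves ahead."""
    if depth == 0:
        return evaluate(state)
    moves = legal_moves(state)
    if maximizing:
        # "If I do this..." -- I pick the move that maximizes my score,
        return max(minimax(apply_move(state, m), depth - 1, False) for m in moves)
    else:
        # "...then what's the worst thing for me that could happen?"
        # -- the opponent picks the move that minimizes my score.
        return min(minimax(apply_move(state, m), depth - 1, True) for m in moves)

# Toy game definition (all of this is illustrative, not a real chess engine):
def evaluate(state):
    return state                 # higher state value = better for us

def legal_moves(state):
    return [+1, -2, +3]          # the same three moves are always available

def apply_move(state, move):
    return state + move

# Choose our move from state 0, looking two further plies ahead.
best = max(legal_moves(0), key=lambda m: minimax(apply_move(0, m), 2, False))
```

This is exactly the “very crisp, logical explanation” kind of program: every decision can be traced through the tree of tried moves, unlike the millions of combined parameters in a neural net.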
So, when you take, you know, a problem like “a nickel or the sun, which is bigger?” or those kind of common sense problems, do we know how to solve them all? Do we know how to make a computer with common sense and we just haven’t done it? We don’t have all the algorithms done, we don’t have the processing power and all of that? Or do we kind of not know how to get that bit of juice, or magic into it?
No, we don’t know. We know how to solve simpler problems that are related, and different researchers may have different plans for getting there, but it’s still an open problem and it’s still research about how do we put things like common sense into computers. One of the important ingredients that many researchers in my area believe is that we need better unsupervised learning. So, unsupervised learning is when the computer learns without being told what it should be doing. So, when the computer learns by observation or by interacting with the world, but it’s not like supervised learning, where we tell the computer, “For this case you should do this.” You know, “The human player in this position played that move. And this other position, the human player played that move.” And you just learn to imitate. Or, you have a human driving a car, and the computer just learns to do the same kinds of moves as the driver would do in those same circumstances. This is called supervised learning. Another example to discriminate between supervised and unsupervised is, let’s say you’re learning in school and your professor gives you exercises and at the end of each exercise, your professor tells you what the right answer was. So now, you can, you know, train yourself through many, many exercises and this is supervised learning. And, it’s hard, but we’re pretty good at it right now. Unsupervised learning would be you go and you read books and you try things for yourself in the world and from that you figure out how to answer questions. That’s unsupervised learning and humans are very good at it. An example of this is what’s called intuitive physics. So, if you look at a two-year-old child, she knows physics. Of course, she doesn’t know Newtonian physics. She doesn’t know the equations of physics. But, she knows that when she drops a ball, it falls down. 
She knows, you know, the notions of solid objects and liquid objects and all this, pressure, and all this and she’s never been told about it. Her parents don’t keep telling her, “Oh, you know, you should use this differential equation and blah blah blah.” No, they tell her about other things. She learns all of this completely autonomously.
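The supervised-learning setting described above—the “professor” supplies the right answer for every exercise, and the learner adjusts itself to reproduce those answers—can be sketched in a few lines. The data and model here are toy illustrations, not anything from the interview:

```python
# Supervised learning: each example is an (input, correct answer) pair.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # answers follow y = 2x + 1

# Fit a simple model y = w*x + b by gradient descent on squared error.
w, b = 0.0, 0.0
for _ in range(2000):
    for x, y in examples:
        err = (w * x + b) - y    # how far we are from the professor's answer
        w -= 0.01 * err * x      # nudge the parameters toward that answer
        b -= 0.01 * err

# Unsupervised learning, by contrast, would receive only the inputs
# [0.0, 1.0, 2.0, 3.0] -- no answers -- and would have to discover
# structure in them on its own.
print(w, b)                      # approximately 2.0 and 1.0
```

The supervised case works well precisely because the right behavior is given explicitly for every example; the unsupervised case, where the learner must figure out how the world works by observation alone, is the open problem discussed above.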
Do you use that example as an analogy or do you think there are things we can learn from how children acquire knowledge that are going to be really beneficial to building systems that can do unsupervised learning?
Both. So, it is clearly an analogy and we can take it as such. We can generalize through other scenarios. But, it’s also true that, at least some of us in the scientific community for AI, are looking at how humans are doing things, are looking at how children are learning, are looking at how our brain works. In other words, getting inspiration from the form of intelligence that we know that exists. We don’t completely understand it, but we know it’s there and we can observe it and we can study it. And scientists in biology and psychology have been studying it for decades, of course.
And similarly, you know, just thinking out loud, we also have neural nets which, again, appeal to the brain.
Have we learned everything we think we’re going to learn from the brain that’s going to help us build intelligent computers? Or are they really just like absolutely nothing in common. They’re like very different systems.
Well, the whole area of deep learning, which is so successful these days, is just the modern form of neural nets. Neural nets have been around for decades, since the ’50s, and they are, of course, inspired by things we knew about the brain. And now we know more and, actually, some elements of brain computation have been imported into neural nets fairly recently. In 2011, we introduced rectifier units in deep neural nets and showed that they help train deeper networks better. And actually, the inspiration for this was the form of the nonlinearities that are present in actual brain neurons. So, we continue to look at the brain as a potential source of inspiration. Now, that being said, we don’t understand the brain. Biologists, neuroscientists know a lot of things, have made tons of observations, but they still don’t have anywhere near the big picture of how the brain works, and most importantly for me, how the brain learns. Because this is the part that really we need to import into our machine learning systems.
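The rectifier unit (ReLU) mentioned above is strikingly simple: it passes positive inputs through unchanged and clips negative ones to zero, loosely mimicking the thresholded firing of biological neurons. A minimal sketch with a toy one-layer forward pass (weights and inputs are illustrative):

```python
def relu(x):
    # The rectifier nonlinearity: zero for negative input, identity otherwise.
    return max(0.0, x)

def layer(inputs, weights, biases):
    # Each unit computes a weighted sum of the inputs, then applies the rectifier.
    return [relu(sum(w * i for w, i in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

hidden = layer([1.0, -2.0],
               weights=[[0.5, 0.25], [-1.0, 0.5]],
               biases=[0.1, 0.0])
# unit 1: 0.5*1.0 + 0.25*(-2.0) + 0.1 =  0.1 -> passes through as 0.1
# unit 2: -1.0*1.0 + 0.5*(-2.0) + 0.0 = -2.0 -> clipped to 0.0
```

Because the rectifier is exactly zero over half its range, many units stay silent for a given input, which is part of why it made deeper networks easier to train than the earlier smooth sigmoid nonlinearities.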
And so, looking out a ways, we talked about common sense. Do you think we’re eventually going to build an AGI, an artificial general intelligence, that is as versatile or more so than a human?
You know, when you talk to people and you say, “When will we have that?” the range that I’ve heard is five to five hundred years. First of all, that’s two orders of magnitude difference. Why do you think there’s such disagreement on when we’ll have it, and if you were then to throw your name in the hat and put a prediction out there, when would you think we would get an AGI?
Well, I don’t have a crystal ball and really you shouldn’t be asking those questions.
Well, there’s so much uncertainty.
So, there is a nice image here to illustrate why it’s impossible to answer this question. It’s like if you are climbing up a mountain range and right now you’re on this mountain and it looks like it’s the biggest mountain around and if you really want to get to the top, you have to reach the top of that mountain. But, really, you don’t see behind that mountain what is looming, and it’s very likely that after you reach the top of that mountain, we might see another one that’s even higher. And so we…you know, we have more work to do. So, right now it looks like we’ve made a lot of progress on this particular mountain which allows us to do so well in terms of perception, at least at some level. But, higher level, we’re still making baby steps, really. And we don’t know if the tools that we currently have, the concepts that we currently have with some incremental work will get us there. Or…and that might then happen in maybe ten years, right? With enough computing power. Or, if we’re going to face some other obstacle that we can’t foresee right now which we could be stuck with for a hundred years. So, that’s why giving numbers I think is not informing us very much.
Fair enough. And if you’ll indulge me with one more highly speculative question, I’ll return to the here and now and practical uses. So, my other highly speculative question is, “Do you think we’re going to build conscious machines ever?” Like, can we do that? And in your mind, is consciousness—the ability to perceive and self-awareness and all that comes with consciousness—is that required to build a general intelligence? Or is that a completely unrelated thing?
Well, it depends on the kind of intelligence that we want to build. So, I think you could easily have some form of intelligence without consciousness. As I mentioned, imagine a really, really smart encyclopedia that can answer any of your questions but doesn’t have any sense of self. But if we want to build things like robots, we’ll probably need to give them some sense of self. Like, a robot needs to know where it is, and how it stands compared to other objects or agents. It needs to know things like if it gets damaged, so it needs to know how to avoid being damaged. It’s going to have some kind of primitive emotions in the sense that, you know, it’s going to have some way of knowing that it’s reaching its objectives or not, and, you know, you can think of this as being happy or whatever. So, the ingredients of consciousness are already there in systems that we are building. But, they’re just very, very primitive forms. So, consciousness is not like a black and white thing. And it’s not even something that people agree on, even less than what intelligence is I think, what consciousness is. But my understanding of it is that we’ll build machines that have more and more of a form of consciousness as needed for the application that we’re building them for.
Our audiences are largely business people and, you know, they constantly read about new breakthroughs every day in the artificial intelligence space. How do you advise people to discover, to spot problems that artificial intelligence would be really good at chewing up and, you know, really getting your hands around? Like, if I were to walk around my company and I go from department to department to department—I go to HR, then I go to marketing, then I go to product, then I go to all of them, everyone—how do you spot things where AI might be a really good thing to deploy to solve?
Okay. So, it depends on your time horizon. So, if you want to use today’s science and technology and apply it to current tasks, it’s different from saying, “Oh, I imagine this particular service or product which could come out in five years from now.”
The former. What can you do today?
Yes. Okay. So, today, things are pretty clear. You need a very well-defined task for which you have a lot of examples of what the right behavior of the computer should be. And a lot could be millions; it depends on the complexity of the task. If you’re trying to learn something very simple, then maybe you need less. And if you need to learn something more complicated… For example, when we do machine translation, you would easily have hundreds of millions of examples. You need a well-defined task in the sense that, in the situation where the computer is going to be used, we know what information it would have—this is the input—and we know what kind of decision it’s supposed to make—this is the output. And we’re doing supervised learning, which is the thing we’re doing really well now. In other words, we can give the computer examples of, “Well, for these inputs, this is the output that it should’ve produced.” So, that’s one element. Another element is, well, not all tasks like this are easy to learn by current AI with deep learning. Some things are easier to learn. In particular, things that humans are good at are more likely to be easier to learn, and so are things that are fairly limited in scope, because in a sense that makes them easier. Things where, if you were able to solve the problem, you must have really good common sense and be basically intelligent—well, you’re probably not going to be able to do those well, because we haven’t solved AI yet. So, these are some things you can look for.
You’ve mentioned games earlier and I guess there’s a long history of using artificial intelligence to play games. I mean, it goes back to Claude Shannon writing about chess. IBM, in the ’50s, had a computer to play checkers. Everybody knows the litany, right? You had Deep Blue and you had chess and then you had Jeopardy and then you had AlphaGo and then you had poker recently. And I guess games work well because there’s a very defined rule set and it’s a very self-contained universe. What would be… You know, everybody talks about Go as this one that would be very hard to do. What is the next game that you think that you’re going to see computers have a breakthrough in and people are going to be scratching their heads and marveling at that one?
So, there are some more complex video games that researchers are looking at, which involve virtual worlds that are much more complex than the kind of simple grid world you have in the case of Go. And also, there’s something about Go and chess which is not realistic for many situations. In Go and chess, you can see the whole state of the world, right? It’s the positions of all the pieces. In a more realistic game, or in the real world, of course, the agent doesn’t know everything there is to know about the state of the world. It only sees parts of it. There is also the question of the kind of data we can get. So, one problem with games, but also with other applications like dialogue, is that it really isn’t the case that you can extract a set of data once and for all—maybe by asking a lot of humans to perform particular tasks—and then just learn by imitation. The reason this doesn’t work is because when the learning machine is going to be playing using its own strategy, it may do things differently than how humans have been doing it. So, if we talk about dialogue, maybe our dialogue machine is going to make mistakes that no human would ever make, and so the dialogue would then move into a sort of configuration that has never been seen when you look at people talking to each other. And now the computer doesn’t know what to do, because it’s not part of what it’s been trained with. So, what it means is that the computer needs to be interacting with the environment. That’s why games that are simulated are so interesting for researchers. Because we have a simulator in which the computer can sort of practice what the effect of its actions would be, and depending on what it does, you know, what it is going to observe and so on.
So, there’s a fair amount of worry and consternation around artificial intelligence. Specifically, with regard to jobs and automation.
Since that’s closer to the near future, what do you think is going to happen?
It’s very probable that there’s going to be a difficult transition in the job market. According to some studies, something like half of the jobs will be impacted seriously. That means a lot of jobs will go. Everybody doing a given job may not necessarily go, but their job description might change, because a lot of what they were doing, which was routine, will be done by computers, and then we’ll need fewer people doing the same work. At the same time, I think, eventually there will be new jobs created, and there should not really be unemployment, because we still want to have humans doing some things that we don’t want computers to do. I don’t want my babies to be taken care of by a machine. I want a human to interact with my babies. I want a human to interact with my old parents. So, all the caring jobs, all the teaching jobs: even though computers will have an important role, I think we would be happy to have, instead of classes of thirty students, classes of five students, or classes of two students. I mean, there’s no limit to how much humans can help each other, and right now we can’t, because it would be too costly. But once the jobs that can be automated are automated, those human-to-human jobs, I think, will become the norm. And also all the jobs that are more creative and less routine, things like, of course, artists, or even scientists, hopefully, we’ll probably want to have more of these people.
Now, that being said, there’s going to be a transition, I think, where a lot of people lose their jobs and don’t have the right training to do something else, to take the other jobs that are going to be opening up. And so, we have to set up the right social safety net to take care of these people, maybe with a guaranteed minimum income or something else, but somehow we have to think about that transition, because it could have a big political impact. Think about the transition that happened with the industrial revolution, from agriculture to industry, and all the human misery that happened, say, between the middle of the nineteenth century and the middle of the twentieth century. Well, a lot of that could have been avoided if we had put in place the kind of social measures that we did finally put in place around the Second World War. So, similarly, we need to think a little bit about what would be the right ways to handle the transition to minimize human suffering. And there’s going to be enough wealth to do it, because AI is going to create a lot of wealth: a lot of new products and services, done more efficiently, so, in a sense, globally we’re all going to get richer. But the question is, where is that money going to go? We have to make sure that some of it goes to help that transition from the human point of view.
People tend to fall into three camps on this question. One says, “There will be a point where a computer can learn a new task faster than a human. And when that happens, that’s a kind of tipping point where they will do everything. They’ll do every single thing a human can do.” So, that’s a school of thought that says you’re going to lose basically all of the jobs; all of them could be done by a machine. Then you get people who say, “Well, there’s going to be some amount of displacement,” and they often appeal to things like the Great Depression, to say, “There are certain people that are going to lose their jobs, and then they’re not going to have the training to find new ones in this new economy.” And then finally you come to people who say, “Look, this is an old, tired song. Unemployment has been between four and nine percent in the West for two hundred and fifty, three hundred years. You can mechanize industry, you can eliminate farming, you can bring electricity in, you can go to coal power, you can create steam; you can do these amazingly disruptive things and you never even see a twitch in the unemployment numbers. None. Nothing. Four to nine percent.” So, that is certainly the historical fact, and that view says you’re not going to have any more unemployment than you do now; new jobs will be made as quickly as the old ones are eliminated. So, anybody who holds one of the other two positions, it’s incumbent on them to say why they think this time is different. And they always have a reason they think this time is different. And it sounds like you think we’re going to have a lot of job turnover, that it’s going to be disruptive enough that we may need a basic income, and that there’s going to be this big mismatch between people’s skills and what the economy needs. So, it sounds like a pretty tumultuous time you’re seeing.
And so, what do you think… If they say, “Well, what’s different this time? Why is it going to be different than say, bringing electricity to industry, or bringing mechanization, replacing animal power with machine power?” I mean, that has to be of the same kind of order as what we’re talking about. Or does it?
So, we’re talking about a different kind of automation, and we’re talking about a different speed at which it’s going to happen. The traditional automation replaced human physical power, and potentially skill, but in very routine ways. The new automation that’s starting is able to deal with many more kinds of tasks. When we went through the earlier transitions, say in the auto industry, or even the agricultural industry, the automation of those rather physical, labor-intensive tasks meant people could migrate to white-collar jobs and the service industry. Now, it’s less clear where the migration will be. I think there will be a migration, as I said, to jobs that involve more human interaction and more creativity than what machines will be able to do for a while. But the other factor is the speed. I think it’s going to happen much faster than it has in the past. And that means people won’t have time to reach retirement before their job disappears. They’re going to lose their jobs in their 30s or 40s or 50s, and, of course, that could create a lot of havoc.
The number one question that I am asked when I speak on this topic, far and away the number one question, is, “What should my children study today so that they will be employable in fifty years?” It sounds like your answer to that is, “Things that require some kind of an emotional attachment and things that require some amount of creativity.” Are there other categories of jobs you would throw into that bucket or not?
Well, obviously, those computers have to be built by some people, so we need scientists, programmers, and so on. That’s going to continue for a while, but that’s a small portion of the population. I think, for those who can, scientific jobs and engineering and computer-related jobs, we’re going to continue to need more and more of these. That’s not going to stop anytime soon. And, as you said, I think the human-to-human jobs, we’re going to need more of; we’re going to want more of. So, basically, what’s going to happen is, some people are going to have extra wealth coming from this. Maybe, you know, you work for Google and you have this big salary, and now you can use this money to send your kids to a school that has classes of size five instead of thirty.
You know, what always gets grouped in with artificial intelligence is the discussion of robots, so that you have both the mind and the body which technology is replacing. Robots seem to advance at a much slower rate. I mean, if they had a Moore’s Law, they’d be doubling every, you know, more than two years. Do you have any thoughts on the marriage of robots with artificial intelligence? Do AIs need to be embodied to learn better, and things like that? Or are those just apples and oranges, with nothing to do with each other?
Oh, they have things to do with each other. So, I do believe that you could have intelligence without a body. But I also believe that having a body, or some equivalent of a body, as I’ll explain, might be an important ingredient to reach that intelligence. I think there are a lot of things that we learn by interacting with the world. You don’t see me right now, but I’m picking up a glass and I’m looking at it from different angles, and if I had never seen this glass, this manipulation could teach me a lot about it. So, I think the idea of robots interacting with the environment is important. Now, with robots themselves, with legs and arms, I expect that the progress is going to be slower than with virtual intelligence. Because with robots the research cycle is slower; you build them, you program them, you try them for real. More importantly, it takes time for a robot to interact, and one robot can only learn so much. But if you have a bot, in other words an intelligence that goes on the web and maybe interacts with people, well, it can interact with millions or even billions of people, because it can have many copies of itself running. And so, it can have interactions, but they’re not physical interactions. They’re virtual interactions, and it can learn from a lot of data, because there’s a lot of data out there, you know, everything on the web. So, there is an opportunity there, and I would bet that we’re going to see progress in AI go faster with those virtual robots than with real, physical robots. But eventually, I think we’ll get those as well; it’s just going to be at a different pace.
One of the areas that you mentioned you’re deeply interested in is natural language processing, and, you know, to this day, whenever I call my airline of choice and I have to say my membership number... It’s got an 8 in it, and if I’m on my headset, it never gets whether it’s an 8, an H, or an A. So, I always have to unplug my headset and say it into the phone, and all of that. And yet, I interface with other systems, like Amazon Alexa or Google Assistant, that seem to understand entire paragraphs and sentences and can capitalize the right things and so forth. Why am I having those two very different experiences? Is it because, in the first case, there’s no context, so it really doesn’t know how to guess between 8 and H and A?
So, right now, the systems that are in place are not very smart, and some systems are not smart at all. Machine learning methods are only starting to be used in those deployed systems, and they are still only used for parts of the system. That’s also true, by the way, of self-driving cars right now. The system is designed more or less by hand, but some parts, like, say, recognizing pedestrians, or, in the case of language, maybe parsing or identifying who you’re talking about just by the name; these jobs are done by separate modules that are trained by supervised learning. So that’s the typical scenario right now. The current state of the art with deep learning in terms of language understanding allows those systems to get a pretty good sense of what you’re talking about in terms of the topics, and even what you’re referring to. But they’re still not very good at making what we would consider rational inferences and reasoning on top of those things. So, something like machine translation, actually, has made a huge amount of progress, in part due to things we’ve done in my lab, where you can get the computer to understand pretty well what the sentence is about and then use the specifics of the words that are being used to produce a good translation. But you could still fail in cases where there are complicated semantic ambiguities. Those don’t come up very often when you do translation. However, they would come up in tasks like, say, the kinds of exams students take, where they read a text and then have to answer questions about it. So, there are still things that we’re not very good at, which involve high-level understanding and analogies.
You mentioned that you were bullish on jobs that required human creativity. And I’ve always been kind of surprised by the number of researchers in artificial intelligence who kind of shrug creativity off. They don’t think there’s anything particularly special or interesting about it and think that computers will be creative sooner than they’ll be able to do other things that seem more mundane. What are your thoughts on human creativity?
So actually, I’ve been working on creativity, and we call it by a different name. In my field, we call it generative models. So, we have neural nets that can generate images; that’s the thing we’re doing the most. But now we are also doing generation of sounds, of speech, and potentially we could synthesize any kind of object if we see enough examples of that type of object. So, the computer can look at examples of natural images and then create new images of some category that look fairly realistic right now. Still, you can recognize that they’re not the real thing, but you can obviously see what the object is. So, we’ve made a lot of progress in the ability of the computer to dream up synthetic images or sounds or sentences. So, there’s a sense in which the computer is creative, and it can invent new poems, if you want, or new music. The only thing is, what it invents isn’t that great from a human point of view, in the sense that it’s not very original, it’s not very surprising. It still doesn’t fit as well as what a human would be able to do. So, although computers can be creative, and we have a lot of research in allowing computers to generate all kinds of new things that look reasonable, we are very far from the level of competence that humans have in doing this. Why we are not there is linked to the central question that I mentioned in the beginning, which is that computers right now don’t have common sense. They don’t have a sufficiently broad understanding of how the world works. That common sense, that causal understanding of the relationships between high-level explanations, between causes and effects, is still missing. And until we get there, the creativity of humans is going to be way, way above that of machines.
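At its simplest, a generative model fits a distribution to examples and then samples new ones. A toy sketch of that principle (fitting a one-dimensional Gaussian; nothing like the neural nets described above, and the numbers are invented for illustration):

```python
import random
import statistics

# "Training data": a handful of example measurements.
examples = [4.8, 5.1, 5.0, 4.9, 5.2, 5.05, 4.95]

# Fit the simplest possible generative model: a Gaussian.
mu = statistics.mean(examples)
sigma = statistics.stdev(examples)

# "Dream up" new samples that resemble the examples.
random.seed(0)
new_samples = [random.gauss(mu, sigma) for _ in range(3)]
print(new_samples)
```

Real generative models (GANs, variational autoencoders, and the like) learn far richer distributions over images or sounds, but the estimate-then-sample idea is the same, and so is the limitation Bengio describes: the samples resemble the data without any causal understanding behind them.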
I reread Moby Dick a couple of months ago, and I remember stopping on this one passage. And I’m going to misquote it, so I apologize to all of the literary people out there that I’m going to mess this up. But it went something like, “And he piled forth on the whale’s white hump the sum of all his rage and fury. If his chest had been a cannon, he would’ve fired his heart upon it.” And I read that, and I put the book down and I thought, “How would a computer do that?” There’s so much going on in there. There are these rich and beautiful metaphors. “If his chest had been a cannon, he would’ve fired his heart upon it.” And why that one? And it does ask this question: Is creativity merely computational? Is it something that’s really reducible down to, “Show me enough examples and I’ll start analogizing and coming up with other examples”? Do you think we’ll have our Herman Melville AI that will just write stuff like that before breakfast?
I do really believe that creativity is computational. It is something we can do on a small scale already, as I said earlier. It is something we understand the principles behind. So, it’s only a matter of having neural nets or models that are smarter, that understand the world better. I don’t think that any of the human faculties is inherently inaccessible to computers. I would say that some aspects of humanity are less accessible, and creativity of the kind that we appreciate is probably one that is going to take more time to reach. But maybe even more difficult for computers, and also quite important, will be to understand not just human emotions, but also something a little bit more abstract, which is our sense of what’s right and what’s wrong. And this is actually an important question, because when we put these computers in the world, in products, and they make decisions, well, for some very simple things we know how to define the task, but sometimes the computer is going to have to make a compromise between doing the task it wants to do and maybe doing bad things in the world. And so, it needs to know what is bad. What is morally wrong? What is socially acceptable? And I think we’ll manage to train computers to understand that, but it’s going to take a while as well.
You’ve mentioned machine translation a couple of times. And anybody who follows the literature is aware that, as you said earlier, our ability to do machine translation has had some real breakthroughs, and you even said you and your team had a hand in some of that. Can you describe in more layman’s terms what exactly changed? What was the “aha” moment, or what was the dataset, or what was different, that gave us this big boost?
So, I would mention two things. One actually dates back to work I did around 2000, so a long time ago: something we call “word embeddings” or “word representations,” where we train the computer to associate with each word a pattern of activation. Think of it like the pattern of activation of neurons in your brain. So, it’s a bunch of numbers. And the thing that’s interesting about this is, you can think of it as a semantic space. Whereas “cat” and “dog” are just two words, and any two words are just symbols, there’s nothing in, say, the spelling of the words “cat” and “dog” that tells us that cats and dogs have something in common. But if you look at how your neurons fire when you see the picture of a cat or the picture of a dog, or if you look at our neural nets and how the artificial neurons in these networks fire in response to a picture of a cat or a picture of a dog, or a text which talks about cats or about dogs, well, actually those patterns are very similar. Because cats and dogs have many things in common: they’re pets, and we have a particular relationship with them, and so on. And so, we can use that to help the computer generalize correctly to new cases, even to new words that it has never seen, because maybe we’ve never seen the translation of that word, but we’ve seen that it’s associated with other words in the same language.
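The “similar patterns of activation” idea can be sketched in a few lines. The vectors below are made up by hand purely for illustration; real systems learn them from data rather than assigning them:

```python
import numpy as np

# Hand-made toy "embeddings": each word is a pattern of activation (numbers).
vectors = {
    "cat":   np.array([0.9, 0.8, 0.1]),
    "dog":   np.array([0.85, 0.75, 0.2]),
    "stone": np.array([0.1, 0.05, 0.9]),
}

def cosine(a, b):
    # Cosine similarity: how closely two activation patterns align.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# "cat" and "dog" fire in similar patterns; "stone" does not.
print(cosine(vectors["cat"], vectors["dog"]))    # close to 1
print(cosine(vectors["cat"], vectors["stone"]))  # much smaller
```

Nothing in the spellings of “cat” and “dog” relates them, but their vectors sit close together in the semantic space, which is exactly what lets the model generalize to words it has rarely seen.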
We can make associations that allow the computer to map symbols from sentences into these semantic spaces, in which sentences that mean more or less the same thing are represented more or less the same way. And so now you can go from, say, a sentence in French to that kind of semantic space, and then from that semantic space you can decode into a sentence in English. So, that’s one aspect of the success of machine translation with deep learning. The other aspect is that we had a really big breakthrough when we understood that we could use something called an attention mechanism to help machine translation. The idea is actually fairly simple to explain. Imagine you want to translate a whole book from French to English. Well, before we introduced this attention mechanism, the strategy would have been: the computer reads the book in French, it builds this semantic representation, all these activations of neurons, and then it uses this to write up the book in English. But this is very hard. Imagine having to hold the whole book in your head; it’s very hard. Instead, a much better way is to translate sort of one sentence at a time, or even to keep track, in the French book and in the English book that I’m producing, of where I’m currently standing, right? So, I know that I’ve translated up to here in French, and I’m looking at the words in the neighborhood to find out what the next word should be in English. So, we use an attention mechanism, which allows the computer to pay attention more specifically to parts of the input: here, you would say, parts of the book that you want to translate, or, for images, the part of the image that you want to say something about. This is, of course, inspired by things we know about how humans use attention. Not just as an external device, you know, I look at something in front of me and I pay attention to a particular part of it, but also internally.
Like, we can use our own attention to look back on different aspects of what we’ve seen or heard and of course that’s very, very useful for us for all kinds of cognitive tasks.
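A stripped-down sketch of the attention idea described above, in the scaled dot-product form: score every input position against a query, normalize the scores into weights, and take a weighted average. The vectors are toy numbers chosen for illustration, not a faithful reproduction of any particular translation model:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys, values):
    # Score each source position against the query, normalize to weights,
    # and return a weighted average of the values: the decoder "looks at"
    # the most relevant part of the input while producing each output word.
    scores = keys @ query / np.sqrt(len(query))
    weights = softmax(scores)
    return weights @ values, weights

# Three source positions with 4-dimensional representations (toy numbers).
keys = values = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])
query = np.array([0.0, 4.0, 0.0, 0.0])  # a decoder state "asking" about position 2
context, weights = attend(query, keys, values)
print(weights)  # highest weight lands on the second source position
```

This is the mechanism that frees the translator from holding the whole book in its head: at each step, the weights concentrate on the neighborhood of the source that matters right now.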
And in your mind, is it like, “You ain’t seen nothing yet. Just wait”? Or is it like, “This is eighty percent of what we know kind of today how to do, so don’t expect any big breakthroughs anytime soon”?
Well, this is like your number of years question.
I see, okay.
But I would say that it’s very likely that the pace of progress in AI is going to accelerate, and there’s a very simple mathematical reason: the number of researchers doing it is increasing exponentially right now. And, you know, science works by small steps, in spite of what may sometimes be said. The effect of scientific advances can be very drastic, because we pass a threshold where suddenly we can solve this task, we can build this product. But science itself is just an accumulation of ideas and concepts; it’s very gradual. And the more people do it, the faster we can make progress. The more money we put in, the better the facilities and computing equipment, the faster we can make progress. So, because there’s a huge investment in AI right now, both in academia and in industry, the rate at which advances are coming and papers are being published is increasing incredibly fast. Now, that doesn’t mean that we might not hit some brick wall and get stuck for a number of years trying to solve a problem that’s really hard. We don’t know. But my guess is that it’s going to continue to accelerate for a while.
Two more questions. First, you’re based in Canada and have done much work there. I see more and more AI writing and publications coming out of Canada, and I saw that the Canadian government is funding some AI initiatives. Is it a fact that there’s a disproportionate amount of AI investment in Canada?
It is a fact, and, actually, Canada has been the leader in deep learning since the beginning of deep learning. Two of the main labs working on this have been my group here in Montreal and Geoff Hinton’s in Toronto. Another important place was New York, with Yann LeCun, and eventually Stanford and other groups. But a lot of the initial breakthroughs in deep learning happened here, and we continue growing. For example, in Montreal, in terms of academic research, we have the largest group doing deep learning in the world. So, there are a lot of papers and advances coming from Canada in AI. We also have, in Edmonton, Rich Sutton, who is one of the godfathers of reinforcement learning, which we didn’t talk about; that’s when the machine learns by doing actions and getting feedback. So, there’s a scientific expertise that up to now has been very strong in Canada and has been exported, because our scientists have been bought up, by US companies mostly. But the Canadian government has understood that if they want some of the wealth coming out of AI to benefit Canada, then we need to have a Canadian AI industry. And so, there’s a lot of investment right now in the private sector. The government is also investing in research centers, so they’re going to create these institutes in Montreal, Toronto, and Edmonton. And, you know, companies are flocking to Montreal. Experts from around the world are coming here to do research, to build companies. So, there’s an amazing momentum going on.
And then finally, what are you working on right now? What are you excited about? What do you wake up in the morning just eager to get to work on?
Ah, I like that question. I’m trying to design learning procedures which would allow the machine to make sense of the world, and the way I think this can be done is if the machine can represent what it sees (so, it sees images, text, and things like that) in a different form, which I call “disentangled.” In other words, trying to separate the different aspects of the world, the different causes of what we’re observing. That’s hard. We know that’s hard, but it has actually been the objective that we set for ourselves more than ten years ago, when we started working on deep learning and I wrote a book chapter with Yann LeCun about the importance of extracting good representations that can separate out those factors. And the new thing is incorporating reinforcement learning, where the learning system, the learning agent, interacts with the world so as to better understand the cause-and-effect relationships, and so separate out the different causes from each other and make sense of the world in this way. It’s a little bit abstract, what I’m saying, but let’s say that it’s fundamental research. It could take decades to reach maturity. But I believe it is very important.
Well, thank you very much. I appreciate you taking the time to chat with us and good luck on your work.
My pleasure. Bye.
Byron explores issues around artificial intelligence and conscious computers in his upcoming book The Fourth Age, to be published in April by Atria, an imprint of Simon & Schuster. Pre-order a copy here.

Announcing “Voices in AI” Podcast

Artificial intelligence has received more digital ink recently than just about any topic in technology. This level of coverage is warranted, since AI reaches new milestones seemingly every week. So whenever AI exceeds human capabilities in some new area, we collectively reflect on what this all means and speculate about what the machines might do next. And on this topic, everyone has opinions, including Elon Musk, Mark Zuckerberg, Bill Gates, Mark Cuban and Stephen Hawking.
The interesting thing to me is how different all of their opinions are. Some think the AIs will take away our jobs; others believe they will create better ones for everyone. Some say we should fear the AIs while others maintain that such concerns are science fiction nonsense. Some predict we will get a general AI in a few years; others think a few centuries.
Why is this? Why do so many smart, informed people have such different opinions on how this technology will unfold? That is the question I am seeking an answer to with “Voices in AI.”
“Voices in AI” is a one-hour podcast in which I have a one-on-one conversation with people associated with AI in one way or another. Our guests are from industry and academia, and include a number of writers, both fiction and nonfiction.
To date, we have recorded 35 episodes. Some of our guests are listed below:

  • James Barrat (author, “Our Final Invention”)
  • Yoshua Bengio (professor and author, “Deep Learning”)
  • Nick Bostrom (author, “Superintelligence”)
  • Soumith Chintala (Facebook)
  • Adam Coates (Baidu)
  • Nikola Danaylov (author, “Conversations With the Future”)
  • Jeff Dean (Google)
  • Pedro Domingos (professor and author, “The Master Algorithm”)
  • Esther Dyson (investor and author)
  • Oren Etzioni (Allen Institute for Artificial Intelligence)
  • Martin Ford (author, “Rise of the Robots”)
  • Carolina Galleguillos (Thumbtack)
  • Rand Hindi (Snips)
  • Bryon Jacobs (Data.world)
  • Daphne Koller (Calico Labs, Coursera)
  • Hugo Larochelle (Google)
  • Markus Noga (SAP)
  • Gregory Piatetsky (KDnuggets)
  • Mark Rolston (argodesign)
  • Suchi Saria (Johns Hopkins University)
  • Robert Sawyer (AI sci-fi author and speaker)
  • Rudina Seseri (investor)
  • Nova Spivack (investor and entrepreneur)
  • Mark Stevenson (author, “We Do Things Differently”)
  • Mike Tamir (Takt; lecturer, University of California, Berkeley)
  • Alan Winfield (professor and robot ethicist)
  • Roman Yampolskiy (professor and author, “Artificial Superintelligence”)

In most cases, I start the interview with a simple question: “What is artificial intelligence?” Even to this very basic question, I have yet to get the same answer twice. This fact alone is quite telling. There isn’t even any kind of consensus on whether artificial intelligence actually is intelligent or is just pretending to be, the way artificial turf isn’t really grass but just looks like it.
From this launching point, my guests and I go through the entire litany of issues, from the future of work to the use of AI in warfare. We discuss how similar machine intelligence is to human intelligence and try to understand how it is that humans so effortlessly do some things that machines are nowhere near able to do. We discuss the possibility of machine consciousness, whether humans in fact have general intelligence, and so on.
In short, the episodes are a full hour of in-depth discussion of what I think is the most interesting topic on the planet.
The podcast will be launched at the end of August. We will be releasing groups of episodes all at once for your binge-listening pleasure. Transcripts will be available for all of the episodes for those who prefer to read.
I am posting this article for three reasons:

  • We are looking for sponsors who want to be associated with this podcast and the issues it explores. Anyone interested in discussing this should send an email to [email protected].
  • We are looking for guests to be on the show. If you are in the AI industry in one form or another and these are topics you want to spend an hour exploring, send an email to [email protected]. Please include a short bio of yourself (or who you are nominating) along with a link to relevant audio or video.
  • If you would like to be notified when we launch and receive occasional emails on this topic, please add your name here.

It feels as though we are living at a great turning point in history, one driven by technology. If you want to be part of that conversation, I invite you to tune in.
I explore issues around artificial intelligence and conscious computers in my upcoming book The Fourth Age, to be published in April by Atria, an imprint of Simon & Schuster. Pre-order a copy here.
