In this episode, Byron and Hugo discuss consciousness, machine learning and more.
Episode 29 – A Conversation with Hugo Larochelle (audio: https://voicesinai.s3.amazonaws.com/2018-01-15-(00-49-50)-hugo-larochelle.mp3)
Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today I’m excited; our guest is Hugo Larochelle. He is a research scientist over at Google Brain. That would be enough to say about him to start with, but there’s a whole lot more we can go into. He’s an Associate Professor, on leave presently. He’s an expert on machine learning, and he specializes in deep neural networks in the areas of computer vision and natural language processing. Welcome to the show, Hugo.
Hugo Larochelle: Hi. Thanks for having me.
I’m going to ask you only one, kind of, lead-in question, and then let’s dive in. Would you give people a quick overview, a hierarchical explanation of the various terms that I just used in there? In terms of, what is “machine learning,” and then what are “neural nets” specifically as a subset of that? And what is “deep learning” in relation to that? Can you put all of that into perspective for the listener?
Sure, let me try that. Machine learning is the field in computer science, and in AI, where we are interested in designing algorithms or procedures that allow machines to learn. And this is motivated by the fact that we would like machines to be able to accumulate knowledge in an automatic way, as opposed to another approach which is to just hand-code knowledge into a machine. That’s machine learning, and there are a variety of different approaches for allowing for a machine to learn about the world, to learn about achieving certain tasks.
Within machine learning, there is one approach that is based on artificial neural networks. That approach is more closely inspired from our brains, from real neural networks and real neurons. It is still somewhat vaguely inspired by—in the sense that many of these algorithms probably aren’t close to what real biological neurons are doing—but some of the inspiration for it, I guess, is a lot of people in machine learning, and specifically in deep learning, have this perspective that the brain is really a biological machine. That it is executing some algorithm, and would like to discover what this algorithm is. And so, we try to take inspiration from the way the brain functions in designing our own artificial neural networks, but also take into account how machines work and how they’re different from biological neurons.
There’s the fundamental unit of computation in artificial neural networks, which is this artificial neuron. You can think of it, for instance, that we have neurons that are connected to our retina. And so, on a machine, we’d have a neuron that would be connected to, and take as input, the pixel values of some image on a computer. And in artificial neural networks, for the longest time, we would have such neural networks with mostly a single layer of these neurons—so multiple neurons trying to detect different patterns in, say, images—and that was the most sophisticated type of artificial neural networks that we could really train with success, say ten years ago or more, with some exceptions. But in the past ten years or so, there’s been development in designing learning algorithms that leverage so-called deep neural networks that have many more of these layers of neurons. Much like, in our brain we have a variety of brain regions that are connected with one another. How the light, say, flows in our visual cortex, it flows from the retina to various regions in the visual cortex. In the past ten years there’s been a lot of success in designing more and more successful learning algorithms that are based on these artificial neural networks with many layers of artificial neurons. And that’s been something I’ve been doing research on for the past ten years now.
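[Editor’s illustration, not part of the conversation: a minimal sketch of the idea described above, in which an artificial neuron takes pixel values as input and layers of such neurons are stacked into a deeper network. The function names, the four-pixel “image,” and the hand-picked weights are all assumptions made for illustration; a real network would learn its weights from data.]

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs squashed by a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def deep_network(inputs, layers):
    """Stack layers: each layer's outputs become the next layer's inputs."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations

# Hypothetical 4-pixel "image" and hand-picked (not learned) weights.
pixels = [0.0, 1.0, 1.0, 0.0]
hidden = ([[1.0, -1.0, -1.0, 1.0], [-1.0, 1.0, 1.0, -1.0]], [0.0, 0.0])
output = ([[2.0, -2.0]], [0.0])

# One hidden layer of two neurons feeding a single output neuron.
result = deep_network(pixels, [hidden, output])
```

Passing a one-element list of layers to `deep_network` gives the single-layer case mentioned above; adding more `(weight_rows, biases)` pairs gives the deeper case.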
You just touched on something interesting, which is this parallel between biology and human intelligence. The human genome is like 725MB, but so much of it we share with plants and other life on this planet. If you look at the part that’s uniquely human, it’s probably 10MB or something. Does that imply to you that you can actually create an AGI, an artificial general intelligence, with as little as 10MB of code if we just knew what that 10MB would look like? Or more precisely, with 10MB of code could you create something that could in turn learn to become an AGI?
Perhaps we can make that parallel. I’m not so much an expert on biology to be able to make a specific statement like that. But I guess in the way I approach research—beyond just looking at the fact that we are intelligent beings and our intelligence is essentially from our brain, and beyond just taking some inspiration from the brain—I mostly drive my research on designing learning algorithms more from math or statistics. Trying to think about what might be a reasonable approach for this or that problem, and how could I potentially implement it with something that looks like an artificial neural network. I’m sure some people have a better-informed opinion as to what extent we can draw a direct inspiration from biology, but beyond just the very high-level inspiration that I just described, what motivates my work and my approach to research is a bit more taking inspiration from math and statistics.
Do you begin with a definition of what you think intelligence is? And if so, how do you define intelligence?
That’s a very good question. There are two schools of thought, at least in terms of thinking of what we want to achieve. There’s one which is we want to somehow reach the closest thing to perfect rationality. And there’s another one which is to just achieve an intelligence that’s comparable to that of human beings, in the sense that, as humans perhaps we wouldn’t really draw a difference between a computer or another person, say, in talking with that machine or in looking at its ability to achieve a specific task.
A lot of machine learning really is based on imitating humans. In the sense that, we collect data, and this data, if it’s labeled, it’s usually produced by another person or committee of persons, like crowd workers. I think those two definitions aren’t incompatible, and it seems the common denominator is essentially a form of computation that isn’t otherwise easily encoded just by writing code yourself.
At the same time, what’s kind of interesting—and perhaps evidence that this notion of intelligence is elusive—is there’s this well-known phenomenon that we call the AI effect, which is that it seems very often whenever we reach a new level of AI achievement, of AI performance for a given task, it doesn’t take a whole lot of time before we start saying that this actually wasn’t AI, but this other new problem that we are now interested in is AI. Chess is a little bit like that. For a long time, people would associate chess playing as a form of intelligence. But once we figured out that we can be pretty good by treating it as, essentially, a tree search procedure, then some people would start saying, “Well that’s not really AI.” There’s now this new separation where chess-playing is not AI anymore, somehow. So, it’s a very tough thing to pin down. Currently, I would say, whenever I’m thinking of AI tasks, a lot of it is essentially matching human performance on some particular task.
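[Editor’s illustration, not part of the conversation: a minimal sketch of treating a game as a tree search, as mentioned above for chess. Chess is far too large for a short example, so this uses a stand-in game chosen purely for illustration: players alternately take one to three stones from a pile, and whoever takes the last stone wins. The function names are hypothetical, and real chess programs add many refinements on top of this basic idea.]

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def minimax(stones, maximizing):
    """Return +1 if the maximizing player wins with perfect play, else -1."""
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    scores = [minimax(stones - take, not maximizing)
              for take in (1, 2, 3) if take <= stones]
    # Each side picks the branch of the game tree that is best for itself.
    return max(scores) if maximizing else min(scores)

def best_move(stones):
    """Choose how many stones to take by searching every continuation."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: minimax(stones - take, False))
```

The search simply enumerates every possible continuation and backs the outcomes up the tree, which is one reason some people stop calling this kind of procedure “AI” once it works.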
Such as the Turing Test. It’s much derided, of course, but do you think there’s any value in it as a benchmark of any kind? Or is it just a glorified party trick when we finally do it? And to your point, that’s not really intelligence either.
No, I think there’s value to that, in the sense that, at the very least, if we define a specific Turing Test for which we currently have no solution, I think it is valuable to try to then succeed in that Turing Test. I think it does have some value.
There are certainly situations where humans can also do other things. So, arguably, you could say that if someone plays against AlphaGo, but wasn’t initially told if it was AlphaGo or not—though, interestingly, some people have argued it’s using strategies that the best Go players aren’t necessarily considering naturally—you could argue that right now if you played against AlphaGo you would have a hard time determining that this isn’t just some Go expert, at least many people wouldn’t be able to say that. But, of course, AlphaGo doesn’t really classify natural images, or it doesn’t dialog with a person. But still, I would certainly argue that trying to tackle that particular milestone is useful in our scientific endeavor towards more and more intelligent machines.
Isn’t it fascinating that Turing said that, assuming the listeners are familiar with it, it’s basically, “Can you tell if this is a machine or a person you’re talking to over a computer?” And Turing said that if it can fool you thirty percent of the time, we have to say it’s smart. And the first thing you say, well why isn’t it fifty percent? Why isn’t it, kind of, indistinguishable? An answer to that would probably be something like, “Well, we’re not saying that it’s as smart as a human, but it’s intelligent. You have to say it’s intelligent if it can fool people regularly.” But the interesting thing is that if it can ever fool people more than fifty percent, the only conclusion you can draw is that it’s better at being human than we are…or seeming human.
Well definitely that’s a good point. I definitely think that intelligence isn’t a black or white phenomenon, in terms of something is intelligent or isn’t, it’s definitely a spectrum. What it means for someone to fool a human more than actual humans into thinking that they’re human is an interesting thing to think about. I guess I’m not sure we’re really quite there yet, and if we were there then this might just be more like a bug in the evaluation itself. In the sense that, presumably, much like we have now adversarial networks or adversarial examples, so we have methods that can fool a particular test. I guess it just might be more a reflection of that. But yeah, intelligence I think is a spectrum, and I wouldn’t be comfortable trying to pin it down to a specific frontier or barrier that we have to reach before we can say we have achieved actual AI.
To say we’re not quite there yet, that is an exercise in understatement, right? Because I can’t find a single one of these systems that are trying to pass the test that can answer the following question, “What’s bigger, a nickel or the sun?” So, I need four seconds to instantly know. Even the best contests restrict the questions enormously. They try to tilt everything in favor of the machine. The machine can’t even put in a showing. What do you infer from that, that we are so far away?
I think that’s a very good point. And it’s interesting, I think, to talk about how quickly are we progressing towards something that would be indistinguishable from human intelligence—or any other—in the very complete Turing Test type of meaning. I think that what you’re getting at is that we’re getting pretty good at a surprising number of individual tasks, but for something to solve all of them at once, and be very flexible and capable in a more general way, essentially your example shows that we’re quite far from that. So, I do find myself thinking, “Okay, how far are we, do we think?” And often, if you talk to someone who isn’t in machine learning or in AI, that’s often the question they ask, “How far away are we from AIs doing pretty much anything we’re able to do?” And it’s a very difficult thing to predict. So usually what I say is that I don’t know because you would need to predict the future for that.
One bit of information that I feel we don’t often go back to is, if you look at some of the quotes of AI researchers when people were, like now, very excited about the prospect of AI, a lot of these quotes are actually similar to some of the things we hear today. So, knowing this, and noticing that it’s not hard to think of a particular reasoning task where we don’t really have anything that would solve it as easily as we might have thought—I think it just suggests that we still have a fairly long way in terms of a real general AI.
Well let’s talk about that for just a second. Just now you talked about the pitfalls of predicting the future, but if I said, “How long will it be before we get to Mars?” that’s a future question, but it’s answerable. You could say, “Well, rocket technology and…blah, blah, blah…2020 to 2040,” or something like that. But if you ask people who are in this field—at least tangentially in the field—you get answers between five and five hundred years. And so that implies to me that not only do we not know when we’re going to do it, we really don’t know how to build an AGI.
So, I guess my question is twofold. One, why do you think there is that range? And two, do you think that, whether or not you can predict the time, do you think we have all of the tools in our arsenal that we need to build an AGI? Do you believe that with sufficient advances in algorithms, sufficient advances in processors, with data collection, etcetera, do you think we are on a linear path to achieve an AGI? Or is an AGI going to require some hitherto unimaginable breakthrough? And that’s why you get five to five hundred years because that’s the thing that’s kind of the black swan in the room?
That is my suspicion, that there are at least one and probably many technological breakthroughs—that aren’t just computers getting faster or collecting more data—that are required. One example, which I feel is not so much an issue with compute power, but is much more an issue of, “Okay, we don’t have the right procedure, we don’t have the right algorithms,” is being able to match how as humans we’re able to learn certain concepts with very little, quote unquote, data or human experience. An example that’s often given is if you show me a few pictures of an object, I will probably recognize that same object in many more pictures, just from a few—perhaps just one—photographs of that object. If you show me a picture of a family member and you show me other pictures of your family, I will probably identify that person without you having to tell me more than once. And there are many other things that we’re able to learn from very little feedback.
I don’t think that’s just a matter of throwing existing technology, more computers and more data, at it; I suspect that there are algorithmic components that are missing. One of them might be—and it’s something I’m very interested in right now—learning to learn, or meta-learning. So, essentially, producing learning algorithms from examples of tasks, and, more generally, just having a higher-level perspective of what learning is. Acknowledging that it works on various scales, and that there are a lot of different learning procedures happening in parallel and in intricate ways. And so, determining how these learning processes should act at various scales, I think, is probably a question we’ll need to tackle more and actually find a solution for.
There are people who think that we’re not going to build an AGI until we understand consciousness. That consciousness is this unique ability we have to change focus, and to observe the world a certain way and to experience the world a certain way that gives us these insights. So, I would throw that to you. Do you, A), believe that consciousness is somehow key to human intelligence; and, B), do you think we’ll make a conscious computer?
That’s a very interesting question. I haven’t really wrapped my head around what is consciousness relative to the concept of building an artificial intelligence. It’s a very interesting conversation to have, but I really have no clue, no handle on how to think about that.
I would say, however, that clearly notions of attention, for instance, being able to focus attention on various things or adding an ability to seek information, those are clearly components for which there’s, currently—I guess for attention we have some fairly mature solutions which work, though in somewhat restrictive ways and not in the more general way; information seeking, I think, is still very much related to the notion of exploration and reinforcement learning—still a very big technical challenge that we need to address.
So, some of these aspects of our consciousness, I think, are kind of procedural, and we will need to figure out some algorithm to implement these, or learn to extract these behaviors from experience and from data.
You talked a little bit earlier about learning from just a little bit of data, that we’re really good at that. Is that, do you think, an example of humans being good at unsupervised learning? Because obviously as kids you learn, “This is a dog, and this is a cat,” and that’s supervised learning. But what you were talking about, was, “Now I can recognize it in low light, I can recognize it from behind, I can recognize it at a distance.” Is that humans doing a kind of unsupervised learning? Maybe start off by just explaining the concept and the hope about unsupervised learning, that it takes us, maybe, out of the process. And then, do you think humans are good at that?
I guess, unsupervised learning is, by definition, something that’s not supervised learning. It’s kind of an extreme of not using supervised learning. An example of that would be—and this is something I investigated quite a bit when I did my PhD ten years ago—to have a procedure, a learning algorithm, that can, for instance, look at images of hundreds of characters and be able to understand that each of these pixels in these images of characters are related. That there are higher-level concepts that explain why this is a digit. For instance, there is the concept of pen strokes; a character is really a combination of pen strokes. So, unsupervised learning would try to—just from looking at images, from the fact that there are correlations between these pixels, that they tend to look like something different than just a random image, and that pixels arrange themselves in a very specific way compared to any random combination of pixels—be able to extract these higher-level concepts like pen stroke and handwritten characters. In a more complex, natural scene this would be identifying the different objects without someone having to label each object. Because really what explains what I’m seeing is that there’s a few different objects with a particular light interacting with the scene and so on.
That’s something that I’ve looked at quite a bit, and I do think that humans are doing some form of that. But also, probably as infants, we’re interacting with our world and we’re exploring it and we’re being curious. And that starts being something a bit further away from just pure unsupervised learning and a bit closer to things like reinforcement learning. So, this notion that I can actually manipulate my environment, and from this I can learn what are its properties, what are the facts and the variations that characterize this environment?
And there’s an even more supervised type of learning that we see in ourselves as infants that is not really captured by purely supervised learning, which is being able to exchange or to learn from feedback from another person. So, we might imitate someone, and that would be closer to supervised learning, but we might instead get feedback that’s worded. So, if a parent says do this or don’t do that, this isn’t exactly an imitation this is more like a communication of how you should adjust your behavior. And this is a form of weakly supervised learning. So, if I tell my kid to do his or her homework, or if I give instructions on how to solve a particular problem set, this isn’t a demonstration, so this isn’t supervised learning. This is more like a weak form of supervised learning. Which even then I think we don’t use as much in the known systems that work well currently that people are using in object recognition systems or machine translation systems and so on. And so, I believe that these various forms of learning that are much less supervised than the common supervised learning is a direction in research where we still have a lot of progress to make.
So earlier you were talking about meta learning, which is learning how to learn, and I think there’s been a wide range of views about how artificial intelligence and an AGI might work. And on one side was an early hope that, like the physical universe which is governed just by very few laws, and magnetism very few laws, electricity very few laws, we hoped that intelligence was governed by just a very few laws that we could learn. And then on the other extreme you have people like the late Marvin Minsky who really saw the brain as a hack of a couple of hundred narrow AIs, that all come together and give us, if not a general intelligence at least a really good substitute for one. I guess a belief in meta learning is a belief in the former case, or something like it, that there is a way to learn how to learn. There’s a way to build all those hacks. Would you agree? Do you think that?
We can take one example there. I think under a somewhat general definition of what learning to learn or meta learning is, it’s something that we could all agree exists, which is, as humans, we’re the result of years of evolution. And evolution is a form of adaptation, I guess. But then within our lifespan, each individual will also adapt to its specific human experience. So, you can think of evolution as being kind of like the meta learning to the learning that we do as humans in our individual lives every day. But then even in our own lives, I think there are clearly ways in which my brain is adapting as I’m growing older from a baby to an adult, that are not conscious. There are ways in which I’m adapting in a rational way, in conscious ways, which rely on the fact that my brain has adapted to be able to perceive my environment—my visual cortex just maturing. So again, there are multiple layers of learning that rely on each other. And so, I think this is, at a fairly high level, but I think in a meaningful way, a form of meta learning. For that reason, I think that investigating how to build learning-to-learn systems is a valuable process for informing how to have more intelligent agents and AIs.
There’s a lot of fear wrapped up in the media coverage of artificial intelligence. And not even getting into killer robots, just the effects that it’s going to have on jobs and employment. Do you share that? And what is your prognosis for the future? Is AI in the end going to increase human productivity like all other technologies have done, or is AI something profoundly different that’s going to harm humans?
That’s a good question. What I can say is that I am motivated by—and what makes me excited about AI—is that I see it as an opportunity of automating parts of my day-to-day life which I would rather be automated so I can spend my life doing more creative things, or the things that I’m more passionate about or more interested in. I think largely because of that, I see AI as a wonderful piece of technology for humanity. I see benefits in terms of better machine translation which will better connect the different parts of the world and allow us to travel and learn about other cultures. Or how I can automate the work of certain health workers so that they can spend more time on the harder cases that probably don’t receive as much attention as they should.
For that reason—and because I’m personally motivated automating these aspects of life which we would want to see automated—I am fairly optimistic about the prospects for our society to have more AI. And, potentially, when it comes to jobs we can even imagine automating our ability to progress professionally. Definitely there’s a lot of opportunities in automating part of the process of learning in a course. We now have many courses online. Even myself when I was teaching, I was putting a lot of material on YouTube to allow for people to learn.
Essentially, I identified that the day-to-day teaching that I was doing in my job was very repetitive. It was something that I could record once and for all and instead focus my attention on spending time with the student and making sure that each individual student solves its own misunderstanding about the topic. Because my mental model of the student in general is that it’s often unpredictable how they will misunderstand a particular aspect of the course. And so, you actually want to spend some time interacting with that student, and you want to do that with as many students as possible. I think that’s an example where we can think of automating particular aspects of education so as to support our ability to have everyone be educated and be able to have a meaningful professional life. So, I’m overall optimistic, largely because of the way I see myself using AI and developing AI in the future.
Anybody who’s listened to many episodes of the show will know I’m very sympathetic to that position. I think it’s easy to point to history and say in the last two hundred and fifty years, other than the Depression, which wasn’t caused by technology obviously, unemployment has been between five and nine percent without fail. And yet, we’ve had incredibly disruptive technologies, like the mechanization of industry, the replacement of animal power with machine power, electrification, and so forth. And in every case, humans have used those technologies to increase their own productivity and therefore their incomes. And that is the entire story of the rising standard of living for everybody, at least in the western world.
But I would be remiss not to make the other case, which is that there might be a point, an escape velocity, where a machine can learn a new job faster than a human. And at that point, at that magic moment, every new job, everything we create, a machine would learn it faster than a human. Such that, literally, everything from Michael Crichton down to…everybody—everybody finds themselves replaced. Is that possible? And if that really happened, would that be a bad thing?
That’s a very good question I think for society in general. Maybe because my day-to-day is about identifying what are the current challenges in making progress in AI, I see—and I guess we’ve touched that a little bit earlier—that there are still many scientific challenges, that it doesn’t seem like it’s just a matter of making computers faster and collecting more data. Because I see these many challenges, and because I’ve seen that the scientific community, in previous years, has been wrong and has been overly optimistic, I tend to err on the side of less gloomy and a bit more conservative in how quickly we’ll get there, if we ever get there.
In terms of what it means for society—if that was to ever happen that we can automate essentially most things—I unfortunately feel ill-equipped as a non-economist to be able to really have a meaningful opinion about this. But I do think it’s good that we have a dialog about it, as long as it’s grounded in facts. Which is why it’s a difficult question to discuss, because we’re talking about a hypothetical future that might not exist before a very long time. But as long as we have, otherwise, a rational discussion about what might happen, I don’t see a reason not to have that discussion.
It’s funny. Probably the truest thing that I’ve learned from doing all of these chats is that there is a direct correlation between how much you code and how far away you think an AGI is.
That’s quite possible.
I could even go further to say that the longer you have coded, the further away you think it is. People who are new at it are like, “Yeah. We’ll knock this out.” And the other people who think it’s going to happen really quickly are more observers. So, I want to throw a thought experiment to you.
It’s a thought experiment that I haven’t presented to anybody on the show yet. It’s by a man named Frank Jackson, and it’s the problem of Mary, and the problem goes like this. There’s this hypothetical person, Mary, and Mary knows everything in the world about color. Everything is an understatement. She has a god-like understanding of color, everything down to the basic, most minute detail of light and neurons and everything. And the rub is that she lives in a room that she’s never left, and everything she’s seen is black and white. And one day she goes outside and she sees red for the first time. And the question is, does she learn anything new when that happens that she didn’t know before? Do you have an initial reaction to that?
My initial reaction is that, being colorblind I might be ill-equipped to answer that question. But seriously, so she has a perfect understanding of color but—just restating the situation—she has only seen in black and white?
Correct. And then one day she sees color. Did she learn anything new about color?
By definition of what understanding means, I would think that she wouldn’t learn anything about color. About red specifically.
Right. That is probably the consistent answer, but it’s one that is intuitively unsatisfying to many people. The question it’s trying to get at is, is experiencing something different than knowing something? And if in fact it is different, then we have to build a machine that can experience things for it to truly be intelligent, as opposed to just knowing something. And to experience things means you return to this thorny issue of consciousness. We are not only the most intelligent creature on the planet, but we’re arguably the most conscious. And that those two things somehow are tied together. And I just keep returning to that because it implies, maybe, you can write all the code in the world, and until the machine can experience something… But the way you just answered the question was, no, if you know everything, experiencing adds nothing.
I guess, unless that experience would somehow contradict what you know about the world, I would think that it wouldn’t affect it. And this is partly, I think, one challenge about developing AI as we move forward. A lot of the AIs that we’ve successfully developed that have to do with performing a series of actions, like playing Go for instance, have really been developed in a simulated environment. In this case, for a board game, it’s pretty easy to simulate it on a computer because you can literally write all the rules of the game so you can put them in the computer and simulate it.
But, for an experience such as being in the real world and manipulating objects, as long as that simulated experience isn’t exactly what the experience is in the real world, touching real objects, I think we will face a challenge in transferring any kind of intelligence that we grow in simulations, and transfer it to the real world. And this partly relates to our inability to have algorithms that learn rapidly. Instead, they require millions of repetitions or examples to really be close to what humans can do. Imagine having a robot go through millions of labeled examples from someone manipulating that robot, and showing it exactly how to do everything. That robot might essentially learn too slowly to really learn any meaningful behavior in a reasonable amount of time.
You used the word transfer three or four times there. Do you think that transfer learning, this idea that humans are really good at taking what we know in one domain space and applying it in another—you know, you walk around one big city and go to a different big city and you kind of map things. Is that a useful thing to work on in artificial intelligence?
Absolutely. In fact, we’re seeing that with all the success that has been enabled by the ImageNet data set and the competition. It turns out if you train an object recognition system on this large ImageNet data set, it really is responsible for the revolution of deep neural nets and convolutional neural nets in the field of computer vision. It turns out that these models trained on that source of data could transfer really well to a surprising number of tasks. And that has very much enabled a kind of a revolution in computer vision. But it’s a fairly simple type of transfer, and I think there are more subtle ways of transferring, where you need to take what you knew before but slightly adjust it. How do you do that without forgetting what you learned before? So, understanding how these different mechanisms need to work together to be able to perform a form of lifelong learning, of being able to accumulate one task after another, and learning each new task with less and less experience, is something I think currently we’re not doing as well as we need to.
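[Editor’s illustration, not part of the conversation: a toy sketch of the simple kind of transfer described above, where features learned on one source of data are reused unchanged and only a small new piece is trained for the new task. Everything here is an assumption for illustration: `pretrained_features` is a hand-written stand-in for a frozen feature extractor (in practice, a network trained on a large data set such as ImageNet), and the new “head” is a basic perceptron trained on four made-up examples.]

```python
def pretrained_features(x):
    """Stand-in for a frozen, previously trained feature extractor (hypothetical)."""
    a, b = x
    return [a + b, a - b]

def score(features, weights, bias):
    """Linear score computed by the small task-specific head."""
    return sum(w * f for w, f in zip(weights, features)) + bias

def train_head(examples, epochs=20, lr=0.1):
    """Train only the new head with perceptron updates; the features stay fixed."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in examples:
            features = pretrained_features(x)
            prediction = 1 if score(features, weights, bias) > 0 else 0
            error = label - prediction
            weights = [w + lr * error * f for w, f in zip(weights, features)]
            bias += lr * error
    return weights, bias

def predict(x, weights, bias):
    """Classify a new input using the frozen features plus the trained head."""
    return 1 if score(pretrained_features(x), weights, bias) > 0 else 0

# Made-up new task with very little data: label is 1 when the first value is larger.
new_task = [((2, 1), 1), ((1, 2), 0), ((3, 0), 1), ((0, 3), 0)]
head_weights, head_bias = train_head(new_task)
```

Because the feature extractor is never updated, nothing it knew before is forgotten. The more subtle case raised above, adjusting earlier knowledge without forgetting it, is not covered by this sketch.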
What keeps you up at night? You meet a genie and you rub the bottle and the genie comes out and says, “I will give you perfect understanding of something.” What do you wrestle with that maybe you can phrase in a way that would be useful to the listeners?
Let’s see. That’s a very good question. Definitely, in my daily research, how we are able to accumulate knowledge, and how a machine would accumulate knowledge over a very long period, learning a sequence of tasks and abilities cumulatively, is something that I think a whole lot about. And this has led me to think about learning to learn, because I suspect that there are ideas. And effectively once you have to learn one ability after the other after the other, that process of doing this and doing it better, the fact that we do it better is, perhaps, because we are learning how to learn each task also. That there’s this other scale of learning that is going on. How to do this exactly I don’t quite know, and knowing this I think would be a pretty big step in our field.
I have three final questions, if I could. You’re in Canada, correct?
As it turns out, I’m currently still in the US because I have four kids, two of them are in school so I wanted them to finish their school year before we move. But the plan is for me to go to Montreal, yes.
I noticed something. There’s a lot of AI activity in Canada, a lot of leading research. How did that come about? Was that a deliberate decision or just a kind of a coincidence that different universities and businesses decided to go into that?
If I speak for Montreal specifically, very clearly at the source of it is Yoshua Bengio deciding to stay in Montreal, staying in academia, and then continuing to train many students, gathering other researchers that are also in his group, and also training more PhDs in a field that doesn’t have as much talent as is needed. I think this is essentially the source of it.
And then my second to the last question is, what about science fiction? Do you enjoy it in any form, like movies or TV or books or anything like that? And if so, is there any that you look at it and think, “Ah, the future could happen that way”?
I definitely used to be more into science fiction. Now maybe due to having kids I watch many more Disney movies than I watch science fiction. It’s actually a good question. I’m realizing I haven’t watched a sci-fi movie for a bit, but it would be interesting, now that I’ve actually been in this field for a while, to sort of confront my vision of it with how artists potentially see AI. Maybe not seriously. A lot of art is essentially philosophy around what could happen, or at least projecting a potential future and seeing how we feel about it. And for that purpose, I’m now tempted to either revisit some classics or see what the recent sci-fi movies are.
I said only one more question, so I’ve got to combine two into one to stick with that. What are you working on, and if a listener is going into college or is presently in college and wants to get into artificial intelligence in a way that is really relevant, what would be a leading edge that you would say somebody entering the field now would do well to invest time in? So first, you, and then what would you recommend for the next generation of AI researchers?
As I’ve mentioned, perhaps not so surprisingly, I am very much interested in learning to learn and meta learning. I’ve started publishing on the subject, and I’m still very much thinking around various new ideas for meta learning approaches. And also learning from, yes, weaker signals than in the supervised learning setting. Learning from something such as worded feedback from a person is something I haven’t quite started working on specifically, but I’m thinking a whole lot about it these days. Perhaps those are directions that I would definitely encourage other young researchers to think about and study and research.
And in terms of advice, well, I’m obviously biased, and being in Montreal studying deep learning and AI, currently, is a very, very rich and great experience. There are a lot of people to talk to, to interact with, not just in academia but now much more in industry, such as ourselves with Google and other places. And also, being very active online. On Twitter, there’s now a very, very rich community of people sharing the work of others and discussing the latest results. The field is moving very fast, and in large part it’s because the deep learning community has been very open about sharing its latest results, and also making the discussion open about what’s going on. So be connected, whether it be on Twitter or other social networks, and read papers and look at what comes up on arXiv—engage in the global conversation.
Alright. Well that’s a great place to end. I want to thank you so much. This has been a fascinating hour, and I would love to have you come back and talk about your other work in the future if you’d be up for it.
Of course, yeah. Thank you for having me.
Byron explores issues around artificial intelligence and conscious computers in his upcoming book The Fourth Age, to be published in April by Atria, an imprint of Simon & Schuster. Pre-order a copy here.