Updated: Tuesday night, I was schooled at playing Jeopardy by Watson in an exhibition match at the Computer History Museum, and discovered that despite our fear of the robot overlords, humans are much smarter than we think. Case in point: Watson could never use Apple’s (s aapl) personal assistant Siri.
While both services seemingly understand what we’re saying to them and can respond with amazingly functional or accurate answers, the truth is they are both still programmed for specific tasks and could never actually converse with one another or a human outside of a narrow context. So Watson can’t take dictation, and Siri can’t play Jeopardy. Understanding why shows how far we have to go when it comes to true artificial intelligence and those fears of the robots taking over.
As David Ferrucci, the guy at IBM (s ibm) behind Watson’s creation, explained in a conversation before the match, as intuitive as interactions with Siri or Watson appear to us, both are fundamentally task oriented. The questions Watson gets are in effect “translated” not just into the zeros and ones of digital signals, but also into a series of words that are then broken down into related concepts.
After that point, Watson tries to ascribe “meaning” to those concepts by searching unstructured data to derive candidate answers. It then determines which answer is most likely to be correct, and how confident it is in that top answer, because in Jeopardy a contestant who guesses wrong — human or machine — is penalized. Thus Watson’s tasks are figuring out the context associated with a question, figuring out which answer is likeliest given that context, and deciding whether it’s confident enough in its probabilities to bother answering.
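That last decision step can be pictured as a simple threshold rule. The sketch below is purely illustrative, not IBM's actual code; the candidate answers, confidence scores, and threshold value are all invented:

```python
# Illustrative sketch of Watson's decision step: score candidate answers,
# then "buzz in" only if confidence in the top answer clears a threshold.
# Candidates, scores, and the 0.5 threshold are invented for illustration.

def choose_answer(candidates, threshold=0.5):
    """Return the top-scoring answer, or None if not confident enough.

    `candidates` maps candidate answers to confidence scores in [0, 1].
    """
    if not candidates:
        return None
    best, score = max(candidates.items(), key=lambda kv: kv[1])
    # In Jeopardy a wrong guess is penalized, so stay silent below threshold.
    return best if score >= threshold else None

# Made-up confidences: one clear winner, then a too-close-to-call case.
print(choose_answer({"What is Chile?": 0.92, "What is Peru?": 0.31}))
print(choose_answer({"What is Chile?": 0.40, "What is Peru?": 0.38}))
```

The point of the threshold is the risk trade-off the article describes: answering with low confidence is worse than not answering at all.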
Siri, on the other hand, does two important things: It recognizes speech (Watson doesn’t actually understand speech, but is fed a text version of the question), and once it understands the words, it can figure out what steps to take in a limited number of applications, using a natural language process related to the one Watson uses. The sense from IBMers (unsurprisingly) is that Siri doesn’t have the natural language depth that Watson does. Siri certainly doesn’t have the computing horsepower behind it (2,880 processor cores and 15 terabytes of RAM), or the 100 GB of text data that Watson uses to figure out how different words relate to each other.
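Siri's second step — mapping recognized words onto a fixed menu of app actions — can be sketched as a dispatch table. Again, this is a hypothetical illustration, not Apple's implementation; the keywords and actions are made up:

```python
# Hypothetical sketch of Siri's task step: once speech is transcribed,
# map the recognized words to one of a fixed set of app actions.
# The keyword matching and the actions themselves are invented.

ACTIONS = {
    "remind": lambda text: f"Created reminder: {text}",
    "message": lambda text: f"Drafted message: {text}",
    "weather": lambda text: "Showing the weather forecast",
}

def dispatch(transcript):
    """Pick an action by keyword; outside its task set, the assistant punts."""
    for keyword, action in ACTIONS.items():
        if keyword in transcript.lower():
            return action(transcript)
    return "Sorry, I can't help with that."

print(dispatch("Remind me to buy milk"))
print(dispatch("What is Chile?"))  # a Jeopardy clue is outside the task set
```

This is why the two systems talk past each other: a question that Watson would research falls straight through Siri's dispatch table, and a command Siri would execute gives Watson nothing to search for.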
The net result of their differences? Not only are Siri and Watson unable to communicate, because each relies on a different input method, but even if they could, their tasks are fundamentally far apart. Both can do natural language processing, but one uses that skill to find related information and figure out which answer is most likely correct, while the other uses it to open applications and perform a set number of tasks.
So while Alan Turing proposed that the best test of artificial intelligence is whether a human can tell if it’s a human or a computer he or she is interacting with, it may be more accurate to say the best test will be creating a machine that can not only understand natural language, as Siri and Watson can, but that combines Watson’s ability to determine the best course of action with Siri’s ability to take that action.
Update: IBM took issue with me saying that Watson can’t talk to Siri, because Watson could indeed tell Siri, “What is Chile?” and Siri could go look it up. However, I meant talking as a synonym for conversing. Watson would have to be fed a text-based version of a question before it could come up with “What is Chile?” and Siri would have to overhear the question before it could provide the answer. That is technically talking, but it’s not a conversation.
All images, except for the Watson probabilities shot, courtesy of the Computer History Museum. The Watson probabilities shot is courtesy of IBM.