Voice Recognition Is Flying, Needs Focus

Back in September 2006, Nokia (NOK) CEO Olli-Pekka Kallasvuo declared that phones were no longer phones but multimedia computers that play back music, record videos, snap photos and, oh yes, make phone calls. Apple’s (AAPL) iPhone has only reinforced that notion. And as the phone morphs into a multimedia marvel, there is a growing realization that the traditional user interface of a phone, the 12-key keypad, may no longer be enough.
The keypad limits how much information we can input into the tiny devices, and acts as a speed bump when we’re trying to navigate through a complex array of features. That means we need new ways to interact with mobile devices. Apple, for one, has bet on the touch screen and the fluid UI.
And then there are those who believe that voice input is the way to go.
Microsoft (MSFT) bet about $900 million when it bought TellMe Networks. Some startups are voice believers as well, such as Cambridge, Mass.-based Vlingo Corp, which I’ve previously written about. Earlier this week, I got a chance to see a demo by Yap, a Charlotte, N.C.-based company with a similar approach — that is, taking voice and inputting it as text for everything from IM to navigation.
As with Vlingo, you need to download a Yap application to your mobile phone to get going, and then use voice to enter everything from instant messages to TwitterGrams. Like Vlingo, Yap does its voice processing on the server side, and then sends the results back over the mobile data channel.
There are others who are taking speech recognition even further by embedding it right into the chips that go into Bluetooth headsets. Cambridge Silicon Radio, a maker of Bluetooth chips, is now embedding speech recognition technology from Sunnyvale, Calif.-based Sensory into chips that will find their way into Bluetooth headsets by the first quarter of 2008.
Sensory CEO Todd Mozer believes that everyone wants to do big things with speech recognition and mobiles, but in the end, the simple functions that enhance the hands-free experience are what make the most sense and are the most useful. Bluetooth devices make perfect sense as a starting point for voice commands. I agree with Mozer.
When I saw the Yap demo, I got the feeling that the application was trying to do too much; it needed some focus. After all, Yahoo (YHOO) and Google (GOOG) can take a similar server-centric voice recognition approach and provide a richer offering. Moreover, they can use their partnerships with large carriers to squeeze out little players such as Yap.
P.S.: All this interest in voice recognition and related technologies could explain the nice bump in the share price of Nuance (NUAN): up 64 percent for the year, despite a recent pullback that’s mostly because the company’s move into the mobile voice recognition arena isn’t sitting well with the Wall Street types.
Related Post: Sit Up and Listen, Future of Software.