iSpeech, a Newark, N.J.-based startup that specializes in lifelike text-to-speech apps and previously rolled out voice technology for the connected home, is launching a platform for publishers, the company plans to announce Tuesday. The tools are designed to help publishers quickly and inexpensively convert books and articles into audio. iSpeech’s first two publishing clients are Evernote and Pearson.
iSpeech gives publishers three options for creating content. They can convert PDFs to audio files; they can add a widget to a website that essentially adds a “play” button to an article; or they can use more sophisticated developer tools built on iSpeech’s API and add them directly to their web pages. Pearson is using the PDF option for its textbooks. Evernote is using the developer tools to integrate speech technology into its web reading platform Evernote Clearly. “The natural evolution of this is to potentially bring this functionality into all of Evernote’s products,” iSpeech COO Yaron Oren told me. “One of the things we hear directly from Evernote customers is they want to be able to listen to their Evernote notes in the car, so it would be great to have this kind of functionality.”
The publishing platform’s business model is basically pay-per-use, Oren said, and the cost usually ends up totaling “less than a tenth of the cost of professional narration.” For websites, iSpeech charges by the word; the rate varies with volume, ranging from $0.01 down to a fraction of a cent per word. For books, the company charges by the page; there are volume discounts, but Oren said that in general, converting a 250-page book to audio with iSpeech would cost at most around $1,000. “We’ve heard from publishers that a book with voice talent tends to cost in the order of $15,000 per book,” he said.
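As a rough check on those numbers, the short sketch below works through the arithmetic using only the figures Oren quoted. The implied per-page rate is derived from the $1,000 ceiling and is an assumption for illustration, not a published iSpeech price; the function names are hypothetical.

```python
# Figures quoted in the article; everything derived from them is illustrative only.
PAGES = 250                # length of the example book
ISPEECH_MAX_COST = 1_000   # dollars, approximate ceiling quoted for a 250-page book
NARRATION_COST = 15_000    # dollars, publishers' estimate for professional voice talent


def cost_per_page(total_cost: float, pages: int) -> float:
    """Implied per-page rate for a given total cost (hypothetical helper)."""
    return total_cost / pages


def cost_ratio(tts_cost: float, narration_cost: float) -> float:
    """Text-to-speech cost as a fraction of professional narration cost."""
    return tts_cost / narration_cost


per_page = cost_per_page(ISPEECH_MAX_COST, PAGES)
ratio = cost_ratio(ISPEECH_MAX_COST, NARRATION_COST)

print(f"Implied ceiling: ${per_page:.2f} per page")
print(f"Share of narration cost: {ratio:.1%}")
```

On these figures the ceiling works out to about $4 per page, or roughly one-fifteenth of the quoted narration cost, which is consistent with Oren’s “less than a tenth” claim.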
Amazon ran into legal trouble when, in 2009, it automatically added text-to-speech technology to ebooks. The company insisted that text-to-speech features don’t violate copyright, but said at the time, “We strongly believe many rights-holders will be more comfortable with the text-to-speech feature if they are in the driver’s seat,” and decided to let rights-holders “decide on a title by title basis whether they want text-to-speech enabled or disabled for any particular title.” Oren says iSpeech will avoid those issues by leaving the decision to publishers — though it seems as if Evernote Clearly could potentially run into trouble, since it doesn’t hold the copyright to the articles that users save to its platform. (Evernote says it’s “comfortable” with the feature and is only running it on the article pages, not on articles saved into Evernote.)
For now, iSpeech’s publisher tools are primarily going to be of interest to nonfiction publishers — not publishers of, say, novels. “It’s a viable alternative to nonfiction, textbooks, or more straightforward news content,” Oren said. “For fiction, or other content where there’s more emotion and differences in reading style, this is not an alternative [yet].” But, he said, “this is about making more content available [as audio]. Professional voice talent is very expensive, and as a result, most books never get made into an audio format. Now there’s an option.”