The voice recognition revolution is almost here

By Pooja Bhatia, for

Feel like breaking your keyboard? Maybe, in a few years, you could.

For years, voice recognition was mainly a novelty for nerds or a pain-in-the-neck substitute for people unable to type. That’s changed, and fast, even in the three short years since Apple unleashed Siri onto a gazillion iPhones. Dictation software is more accurate, yes, but it’s more than that. Advances in natural language processing, which can understand the meaning of the words and recognize habits of speech, have given rise to brave new possibilities for the spoken word, like the Internet of Things and identification by speech recognition.

Even if talking to toasters isn’t your thing, glimmers of the spoken-word revolution are all around. Nowadays, law firms and medical practices are switching from analog dictation to voice recognition; the cost savings and efficiency gains astonish, they say. And in offices worldwide, there’s talk of resurrecting the near-dead art of dictation. Techies say the next generation of personal digital assistants, Siri’s daughters, will not only decipher what you say, but also follow your orders. Open the door? Drive to the grocery store? Write a thank-you note to Uncle Ted? Done.

It’s a heyday for post-literacy theorists, a tiny, ardent group who believe that the written word — handwritten, texted, printed, keyboarded — is on its way out. According to post-literacy theorist William Crossman, “Text has run its historic course and is now rapidly getting replaced in every area of our lives by the ever-increasing array of emerging ITs driven by voice, video, and body movement/gesture/touch rather than the written word.” Skeptics take note: Crossman wrote that prediction in The Futurist in 2012.

“Literacy is doomed,” says Michael Ridley, writing — again — in an e-book.

The paranoid Luddites among us wonder if a Fahrenheit 451 world can be far behind. Even before Bradbury fretted over state-ordered book burning, H.G. Wells had conjured a world in which “telephone, kinematograph and phonograph had replaced newspaper, book, schoolmaster, and letter.” For Wells, post-literacy was dystopia. Without the written word, civilization would collapse.

But theorists of the so-called post-literate world envision a future in which technology heightens communication and learning, instead of eroding it. It’s not a return to orality. Communication will be liberated from text, and the new IT tools will be as transformative as literacy was, though we probably don’t know what they are yet: brain-sensing headbands, perhaps, or bio-computing and techlepathy?

For now, Nuance Communications and its competitors would be happy to turn us all into dictators. (Nuance makes Dragon, the best-known dictation software, and supplies speech recognition technology behind Siri.) But even cheerleaders caution that civilization isn’t going post-literate anytime soon.

For starters, voice recognition still needs greater accuracy, and for specialized languages, like computer code or chemistry, symbols rule the day. Voice recognition will improve — maybe even become perfect — but relearning how to dictate memos, stories and emails might take a while for keyboarders. And the advantages of text-based communication, whether for managing risk in a booty call or avoiding drawn-out interactions, aren’t going away soon.

“I don’t believe society will be keyboardless, because there are environments, situations and applications where it just makes sense,” says Peter Mahoney, chief marketing officer at Nuance. “But I do think we’re seeing a resurgence of the skill of composing information with your voice. In some ways, it’s more natural and easy to do.”

And efficient. Proponents say that most people can speak at least three times as fast as they can type. If technology can do the typing, your clients might not want to pay you to do it. “You can see the payoff of dictation immediately because it’s so much faster than typing,” says cognitive scientist and consultant Luc Beaudoin. As an example, he cites Winston Churchill, winner of the Nobel Prize — not in Peace, but in Literature.

“What’s absolutely mind-boggling is that he wrote so much — newspapers, books, speeches, white papers — on his own,” says Beaudoin. “And if it hadn’t been for dictation, I don’t think he’d have had the patience to sit and write it out.”

Realizing the advantages of dictation requires some neural rewiring, Beaudoin and others say. Keyboarding uses your brain’s motor cortex, linking the muscles in your hands to the composing parts of your brain, so you literally think through your fingers. On the other hand, dictation allows people to untether themselves from screens or paper, thus freeing up their imaginations, Beaudoin suspects.

But even if dictation takes off again, social scientists don’t see text becoming obsolete. Consider SMS. When it was introduced in the early 1990s, it was the poor man’s substitute for a real-time conversation, says Naomi Baron, a linguistics expert at American University. Yet people kept texting and IMing when the cost advantages disappeared. Many hundreds of billions of messages flew off last year.

“I believe it’s because we don’t always want to talk with people,” says Baron. Texting, unlike talking, gives users more control over communication, opening up space for manipulation, hiding, fronting, boundaries. Speech recognition will improve, says Baron, but human nature won’t change much. “If I were gazing in my crystal ball, I’d say the technological capabilities will outstrip our social preferences,” she says.

Moreover, while we might speak much faster than we can type, we can also, at least in literate cultures, absorb printed information pretty quickly. Writing is uniquely amenable to scanning.

What’s more likely than a “post-literate” world is what Mahoney, of Nuance, calls “multimodal communication”: some typing, some touching or tapping, some voice, some gesture. Nuance aims to make technology “fluent with all forms of human communication,” he says.

So if you do grab a hammer for that keyboard, be sure to hang onto your reading glasses.

Dolbey Systems, Inc.