Saturday, August 24, 2019

"Death Calling"

They say it's the future. They say it makes things more efficient. They say it opens up whole new ways to control technology. And indeed, when it works, it is just this side of magic. Talk to your phone or your car or your home assistant, and its ability to listen, understand, respond in kind and do your bidding is truly remarkable. 

In fact, when you look at it in terms of its impact, speech recognition, along with its more advanced cousin, voice transcription, has become perhaps the most important computing advance since the mouse. It fundamentally changes the way we interact with the technology that is embedded in our lives. It frees us from having to work on flat surfaces that support keyboards, and even from having to look at our devices, whether on a desk or in our hands. Made possible by all the buzzwords of the moment – artificial intelligence, cloud computing, big data, neural nets – it has the potential to unlock access to almost infinite problem solving for all, even those who can't read or write. 

However, anyone who has ever tried it knows that the future isn't always now. While the systems have gotten much better, with error rates approaching those of human transcribers, they seem to fail spectacularly as often as not. Add in noisy environments, like cars and trains and sidewalks, and the results can be downright comical, if not exasperating. Or as Gerald Friedland, a Principal Data Scientist at the Lawrence Livermore National Laboratory, noted, "Depending who you ask, speech recognition is either solved or impossible." 

Just try asking your phone to do something for you. If you hit it just right, if the background sound isn't too much, if you have a solid connection, if you speak clearly and distinctly, the results can be impressive. "OK, Google, send an email to Brian Jones." In seconds, she responds: "Sure, what's the message?" You dictate, "Please call me about the Boston job." She parrots that back to you, adding a final, "Do you want to send it or change it?" You reply the former, and off it goes. HAL 9000 would be impressed. 

But note the many "ifs." The potential points of failure add up to as many misses as hits. How often have you asked it to call a person you speak to regularly, only to have it come back with "I'm sorry, I can't locate that person in your contacts"? Try again, saying it slower and louder, and you get "I'm sorry, there is no one in your contacts with that name." Try it a third time, and it will either repeat the familiar refrain or, just as likely, respond with "Millard Fillmore was the 13th president of the United States." Well, let's not waste this moment of triumph: get him on the phone. 

Numerous postings online show just how far afield the process can go. For every success there is an epic fail that makes you wonder what the original request or statement could possibly have been: "Hi again This is Michael. So calling from Ralph there. Volkswagen lasagna." Well, sure. I guess better than Chevy tacos. Or how about, "I just wanted to let you know so that you weren't surprised if you come back for shower tomorrow that my cousin is girlfriend, maybe." That will be an awkward Thanksgiving dinner. 

The systems seem to have better success when the universe of words involved is in its original wheelhouse. Ask Alexa to set an oven timer and it never misses. Tell Siri to dial a series of digits, and you almost always get through. But then again, sometimes she (and yes, the voices are all female by default) seems like she's just messing with you. How else to explain a transcription like this: "Hi Allen my name is White and my number is area code (626) 523-8023 once again the number is (562) 652-3808." 

Then again, there are times when you want to avoid a call and not talk to the other party. In that case perhaps it is better that you let it go to voicemail and not get engaged. Or at least that would seem to be the case for the following. One wonders who was really calling. But if it was transcribed correctly? Then better to have not answered: "Hi Kelly. Death calling."

-END-

Marc Wollin of Bedford has yelled at his phone many times. His column appears regularly in The Record-Review, The Scarsdale Inquirer and online at http://www.glancingaskance.blogspot.com/, as well as via Facebook, LinkedIn and Twitter.
