Talk to Your Gadgets

AT&T's Watson leads a pack of new gadgets that understand spoken instructions.

By Stephen Cass, Nov 26, 2012 8:42 PM
Courtesy AT&T

The builders of mobile gadgets face a paradox. They want to make the most powerful device they can, squeezed into the smallest box possible. But for a device to be useful, human beings have to be able to interact with all its features. More and more functions mean more and more buttons—and humans have stubbornly remained the same size and shape. A button can be made only so small before it becomes impossible to press, putting a tough limit on miniaturization. Different devices confront this paradox in different ways: Cell phone keypad buttons routinely do double, triple, and even quadruple duty, while devices like tablet computers use touch screens and gesture recognition.

AT&T is developing another solution. It wants you to be able simply to talk to an electronic device and have it follow your instructions. While some cell phones already offer voice recognition for basic tasks, such as looking up phone numbers in a contact list, AT&T envisions devices that can handle much more complicated voice commands, such as “Tell me where I can find the nearest ATM” or “Order me a pepperoni pizza.”

For decades AT&T has been working on a voice recognition system that can handle just such requests. Known as Watson, it is so complex that it is more practical to run the software on centralized servers than to install, manage, and maintain it on countless mobile devices. Fortunately, today’s mobile devices have the ability to connect to the Internet in spades. By including some very basic hardware and software to capture and compress speech (which phones already possess), any device can be given the gift of voice recognition. Captured speech is sent, via the Internet or a cell phone network, to AT&T computers running Watson. The Watson software analyzes the speech and sends back a digital response that the device can translate into commands. To demonstrate the principle, AT&T researchers have built a voice-operated television remote control. Designed to work with AT&T’s Internet TV service, U-verse, the remote lets you do things like ask it to find any comedies that might be on TV now or to search the listings for movies starring Bruce Willis.
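
The round trip described above (capture, compress, send, analyze, respond) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names are hypothetical, `zlib` stands in for a real speech codec, and the "server" is a local function that treats the bytes as already-transcribed text rather than doing any actual recognition. It is not AT&T's interface.

```python
import zlib

def compress_speech(captured: bytes) -> bytes:
    """Device side: shrink the captured data before sending it over the network.
    (Assumption: zlib stands in for a real audio codec.)"""
    return zlib.compress(captured)

def recognition_server(payload: bytes) -> str:
    """Stand-in for the remote recognizer. This toy version simply decompresses
    the payload and treats it as text; a real server would analyze audio."""
    return zlib.decompress(payload).decode("utf-8").strip().lower()

def handle_response(text: str) -> str:
    """Device side: translate the server's text reply into a local command."""
    if text.startswith("find "):
        return "SEARCH_LISTINGS:" + text[len("find "):]
    return "UNKNOWN"

def round_trip(captured: bytes) -> str:
    """Run the whole pipeline: compress, 'send', recognize, act."""
    payload = compress_speech(captured)
    reply = recognition_server(payload)
    return handle_response(reply)

print(round_trip(b"Find comedies on now"))  # SEARCH_LISTINGS:comedies on now
```

The point of the split is that the device only needs the two small pieces at the edges; everything heavy lives in the middle function, which could run anywhere.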

AT&T is already working with developers to create prototypes for other real-world applications (a yellow pages application for the iPhone, for instance) and expects to make more announcements about the future of this technology in the next few months.

How it Works: AT&T’s networked voice recognition system is a mash-up, software that uses the Internet to glue together programs with different capabilities. Here, the goal is to merge a general voice recognition application, Watson, with things like information databases or the specialized software that runs a cable television or digital video recorder. In the example of a TV remote control, the remote captures speech from the user (“I want to see Channel 114”), compresses it, and uses a wireless connection to send it to a server running Watson. Watson not only recognizes individual words but can also be programmed to extract some meaning from simple sentences. It does this using sets of rules that reduce a variety of naturally spoken sentences to standardized text; for example, “What is the time?” means the same thing as “Tell me the time.” Software running on the device can then translate the text into actual machine commands, such as transmitting to a television the signal to select a particular channel.
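
To make the rule idea concrete, here is a minimal sketch in Python of how differently worded sentences might be reduced to one standardized form and then turned into a device command. The rule table, the canonical strings (`GET_TIME`, `SET_CHANNEL`), and the command dictionaries are all invented for illustration; the article does not describe Watson's actual rule format.

```python
import re

# Hypothetical rules: each pattern maps several phrasings to one canonical text.
RULES = [
    (re.compile(r"^(what is|what's|tell me) the time\??$"), "GET_TIME"),
    (re.compile(r"^(i want to see|show me|go to|switch to) channel (?P<n>\d+)$"),
     "SET_CHANNEL {n}"),
]

def normalize(utterance: str) -> str:
    """Reduce a recognized sentence to standardized text, or 'UNKNOWN'."""
    text = utterance.strip().lower().rstrip(".!")
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            # Fill in any named pieces the rule captured (e.g. channel number).
            return template.format(**match.groupdict())
    return "UNKNOWN"

def to_device_command(canonical: str) -> dict:
    """Device side: translate standardized text into a machine command."""
    verb, _, arg = canonical.partition(" ")
    if verb == "SET_CHANNEL":
        return {"action": "tune", "channel": int(arg)}
    if verb == "GET_TIME":
        return {"action": "show_clock"}
    return {"action": "none"}

print(normalize("What is the time?"))          # GET_TIME
print(normalize("Tell me the time."))          # GET_TIME
print(to_device_command(normalize("I want to see Channel 114")))
```

Keeping the two steps separate mirrors the division described above: the server only has to agree with the device on the standardized text, and each device decides for itself what that text means in hardware terms.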

From left courtesy Nuance Communications; G. G. Electronics; Magellan

Buy it Now: Voice recognition is already proving itself in places where people don’t have, or can’t use, a keyboard.

Magellan Maestro 4250: This GPS navigator understands a selection of common questions, such as “Where am I?” and “Nearest gas?” so you can keep your hands on the wheel.

Vocally Infinity: When connected to a phone, this device will retain up to 60 phone numbers and will dial them in response to a spoken name. It is aimed at people who have difficulty using a numeric keypad, such as some elderly users.

Dragon NaturallySpeaking 10: For those who need to create a lot of text without a keyboard—office workers with repetitive strain injury, for instance—NaturallySpeaking 10 translates speech to printed words on a desktop or laptop computer.


Copyright © 2023 Kalmbach Media Co.