It didn’t take long for Edward Chang to see the implications of what he was doing. The neuroscientist and brain surgeon at the University of California, San Francisco, was studying the brain activity behind speech, that precise and delicate neural choreography by which lips, jaw, tongue, and larynx produce meaningful sounds.
By implanting an array of electrodes between the outer and inner membranes of the brain, directly over the region that controls speech, he and his team were able to detect distinct patterns of brain activity associated with specific sounds: the individual vowels and consonants, the duh, guh, ee, and ay sounds that combine to form words.
“We realized that we had a code for every speech sound in the English language,” Chang says. And that realization opened up some astonishing possibilities.
In a series of papers published between 2019 and 2021, Chang and his team demonstrated how they could use machine learning, a form of artificial intelligence, to analyze those patterns. They immediately saw the potential benefit for people who’ve lost the ability to speak because of brain-stem stroke, cerebral palsy, ALS, or other forms of paralysis: once a person’s intended words and sentences are reconstructed from the brain patterns, they can be displayed as text on a screen. More recently, the researchers demonstrated that the words a person is trying to say can even be translated into a computer-generated voice and into facial movements on an on-screen avatar, letting a paralyzed person communicate not just with speech but with facial expressions as well.
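To make the decoding idea concrete, here is a minimal, purely illustrative sketch in Python: it trains a simple classifier to recognize simulated "neural activity patterns" for a handful of speech sounds and reports how well it decodes held-out trials. Everything in it (the electrode count, the synthetic data, the logistic-regression model) is an assumption chosen for illustration; Chang's team worked with real electrocorticography recordings and far more sophisticated neural-network decoders.

```python
# Conceptual sketch only: a toy decoder mapping simulated neural feature
# vectors to speech-sound labels. The data, channel count, and model are
# invented for illustration and are not the research team's actual pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

phonemes = ["duh", "guh", "ee", "ay"]   # stand-ins for a few speech sounds
n_electrodes = 128                      # hypothetical number of channels
trials_per_phoneme = 200

# Simulate a distinct average activity pattern per sound, plus trial-to-trial noise.
X_parts, y = [], []
for label, phoneme in enumerate(phonemes):
    template = rng.normal(size=n_electrodes)
    trials = template + rng.normal(scale=0.8, size=(trials_per_phoneme, n_electrodes))
    X_parts.append(trials)
    y.extend([label] * trials_per_phoneme)

X = np.vstack(X_parts)
y = np.array(y)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Decode" which sound each activity pattern corresponds to.
decoder = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Held-out decoding accuracy: {decoder.score(X_test, y_test):.2f}")
```

In a real speech neuroprosthesis, the decoded sound or word sequence would then feed downstream components, such as a text display or a speech synthesizer; this sketch stops at the classification step.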