Google’s DeepMind brought us artificial intelligence systems that can play Atari classics and the complex game of Go as well as — no, better than — humans.
Now, the artificial intelligence research firm is at it again. This time, its machines are getting really good at sounding like humans.
In a blog post Thursday, DeepMind unveiled WaveNet, an artificial intelligence system that the company says narrows the gap between the best existing text-to-speech technologies and natural human speech by more than 50 percent. Rather than stitching together fragments of recorded speech the way conventional systems typically do, WaveNet learns from raw audio and then generates digital sound waves that resemble those produced by the human voice, an entirely different approach.
The result is more natural, smoother-sounding speech, but that's not all. Because WaveNet works with raw audio waveforms, it can model any voice, in any language. WaveNet can even model music.
And it did. It’s pretty good at piano. Listen for yourself.
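How does a computer build a sound wave directly? The core idea behind WaveNet-style synthesis is autoregressive generation: the waveform is produced one audio sample at a time, with each new sample predicted from the ones that came before it. The toy Python sketch below illustrates that loop under simplifying assumptions; the constants, function names, and the stand-in uniform "model" are hypothetical, not DeepMind's code, and a real WaveNet replaces the stand-in with a deep neural network trained on many hours of recorded speech.

```python
import numpy as np

# Toy sketch of WaveNet-style, sample-by-sample audio generation.
# Everything here is illustrative; it only shows the shape of the loop.

SAMPLE_RATE = 16_000          # audio samples per second
QUANTIZATION_LEVELS = 256     # the next sample is picked from 256 amplitude levels
RECEPTIVE_FIELD = 1_024       # how many past samples the predictor gets to see

def mu_law_encode(audio, levels=QUANTIZATION_LEVELS):
    """Compress raw [-1, 1] audio into discrete levels, turning
    waveform prediction into a classification problem."""
    mu = levels - 1
    companded = np.sign(audio) * np.log1p(mu * np.abs(audio)) / np.log1p(mu)
    return ((companded + 1) / 2 * mu).astype(np.int64)

def mu_law_decode(indices, levels=QUANTIZATION_LEVELS):
    """Map discrete levels back to a raw waveform in [-1, 1]."""
    mu = levels - 1
    companded = 2 * indices.astype(np.float64) / mu - 1
    return np.sign(companded) * ((1 + mu) ** np.abs(companded) - 1) / mu

def generate(model_predict, n_samples, seed=None):
    """Generate audio one sample at a time. `model_predict` is any function
    that maps the last RECEPTIVE_FIELD quantized samples to a probability
    distribution over the next sample (a trained network in practice)."""
    rng = np.random.default_rng(seed)
    # Start from the middle amplitude level, i.e. roughly silence.
    history = np.full(RECEPTIVE_FIELD, QUANTIZATION_LEVELS // 2, dtype=np.int64)
    output = []
    for _ in range(n_samples):
        probs = model_predict(history)                   # shape: (levels,)
        nxt = rng.choice(QUANTIZATION_LEVELS, p=probs)   # draw the next sample
        output.append(nxt)
        history = np.append(history[1:], nxt)            # slide the context window
    return mu_law_decode(np.array(output))

def uniform_model(history):
    """Placeholder predictor: every level is equally likely, so the
    output is pure noise rather than speech or piano."""
    return np.full(QUANTIZATION_LEVELS, 1.0 / QUANTIZATION_LEVELS)

waveform = generate(uniform_model, n_samples=SAMPLE_RATE)  # one second of audio
```

Swapping the placeholder predictor for a network that has learned the statistics of real recordings is what lets the same loop produce speech in one run and piano in another.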
Someday, man and machine will routinely strike up conversations ...