Singing is an extraordinary human skill. It requires the ability to form words, then the ability to vocalize them at a certain pitch, and finally the ability to synchronize them with the notes of a melody. For many, it comes naturally: humans seem to be hardwired for singing.
Not so machines. Teaching a computer to sing — to turn a musical score into vocalized song — has turned out to be hugely frustrating.
First, these devices must master the ability to turn text into speech, which is itself an ongoing challenge in computer science. They must then match the words to the notes at the level of syllables and even at the level of phonemes. Finally, these phonemes, syllables and words need to be vocalized at the correct pitch and for the right duration.
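To make the last two steps concrete, here is a minimal Python sketch of what matching syllables to notes might look like. It is an illustration only, not how any particular singing system works. It assumes the lyrics have already been split into syllables, that each syllable gets exactly one note, and that pitches are given as MIDI note numbers; the names `Note`, `SungSyllable`, `midi_to_hz` and `align` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Note:
    midi_pitch: int  # MIDI note number (69 is the A above middle C)
    beats: float     # length of the note in beats


@dataclass
class SungSyllable:
    syllable: str
    frequency_hz: float  # target pitch for the voice
    duration_s: float    # how long the syllable is held


def midi_to_hz(midi_pitch: int) -> float:
    """Convert a MIDI note number to a frequency, assuming equal
    temperament with A (MIDI 69) tuned to 440 Hz."""
    return 440.0 * 2 ** ((midi_pitch - 69) / 12)


def align(syllables, notes, tempo_bpm=120.0):
    """Pair each syllable with one note (a simplifying assumption:
    real songs often stretch one syllable over several notes)."""
    if len(syllables) != len(notes):
        raise ValueError("this sketch needs exactly one note per syllable")
    seconds_per_beat = 60.0 / tempo_bpm
    return [
        SungSyllable(
            syllable=s,
            frequency_hz=midi_to_hz(n.midi_pitch),
            duration_s=n.beats * seconds_per_beat,
        )
        for s, n in zip(syllables, notes)
    ]


if __name__ == "__main__":
    score = [Note(60, 1), Note(60, 1), Note(67, 1), Note(67, 1)]
    for sung in align(["twin", "kle", "twin", "kle"], score):
        print(sung)
```

Even this toy version shows why the problem is hard: it produces only a target pitch and duration for each syllable, and leaves the actual vocalization, the part that makes a voice sound like singing, entirely unsolved.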