A new, far more efficient system gives a voice back to patients who have lost it


After years of studies and refinements, a breakthrough has arrived that could change the lives of people who have completely lost the ability to speak: a new algorithm makes it possible to build neuroprostheses that transform brain signals into audible, natural speech in real time. This is the finding of a peer-reviewed study by scientists at the University of California, Berkeley (UC), published in the scientific journal Nature Neuroscience. Whereas earlier systems took about eight seconds to generate a sentence, the new one, built through a clinical study and with the help of artificial intelligence, emits the first sound within a second of the intention to speak. “We discovered that we could decode neural data and, for the first time, enable almost synchronous vocal streaming. The result is a more natural and fluid speech synthesis,” said Gopala K. Anumanchipalli, associate professor of electrical engineering and computer science at UC Berkeley and the study’s co-lead investigator.

People with severe paralysis, caused for example by amyotrophic lateral sclerosis (ALS) or a stroke, can lose the ability to speak entirely. Brain-computer interfaces had opened a way to restore some form of communication, but until now the available solutions were slow, cumbersome and often limited to converting brain activity into text. Previous attempts at speech synthesis suffered from excessive latency, making smooth conversation impossible. To overcome this limitation, the California researchers used advanced artificial-intelligence models, similar to those that power voice assistants like Siri and Alexa, to decode neural signals as a continuous stream. Specifically, the technology was tested in a clinical trial at UC, where a participant with paralysis was able to “speak”: she silently attempted sentences that were transformed into speech almost simultaneously.
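
To make the latency difference concrete, here is a minimal sketch in Python of the two strategies described above: a batch pipeline that waits for the whole utterance before synthesizing, and a streaming pipeline that emits audio chunk by chunk. The `decoder` and `vocoder` objects, the chunk size and the frame timing are illustrative assumptions, not details taken from the published study.

```python
# Illustrative sketch (not the authors' code): sentence-level decoding waits for the
# full utterance, while streaming decoding emits audio as soon as a small buffer of
# neural feature frames is available. `decoder` and `vocoder` are placeholder objects.
from collections import deque

CHUNK_FRAMES = 80  # assumed: roughly 0.8 s of neural feature frames per decoded chunk


def batch_decode(neural_frames, decoder, vocoder):
    """Older approach: synthesize only after the whole sentence has been recorded."""
    acoustic_features = decoder.decode(neural_frames)  # decode the entire utterance at once
    return vocoder.synthesize(acoustic_features)       # audio arrives several seconds later


def streaming_decode(frame_stream, decoder, vocoder):
    """Streaming approach: yield audio chunk by chunk with sub-second latency."""
    buffer = deque(maxlen=CHUNK_FRAMES)
    for frame in frame_stream:                         # neural frames arrive in real time
        buffer.append(frame)
        if len(buffer) == CHUNK_FRAMES:                # enough context for one chunk
            audio_chunk = vocoder.synthesize(decoder.decode(list(buffer)))
            buffer.clear()
            yield audio_chunk                          # playback starts within ~1 s of intent
```

The difference is simply where synthesis happens: inside the loop, as the data arrives, rather than once at the end of the sentence.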

The system samples signals from the motor cortex, the area of the brain that controls the movements of the vocal tract, and an AI model interprets these impulses and translates them into sounds. To train it, the researchers asked the patient, Ann, to read sentences on a screen and silently attempt to pronounce them. Since these attempts produce no real speech, the algorithm reconstructed the missing audio using pre-trained models, including one based on recordings of Ann’s voice made before her injury. The result is a fluid, continuous and personalized synthesis, which the patient herself described as “more controllable” than previous methods. The system was also tested on new words, showing that it can generalize beyond its training data. “This is a revolutionary framework,” said co-author Cheol Jun Cho, adding that the AI does not simply repeat learned data but learns the fundamental principles of phonation: “We can now work to further improve the speed, expressiveness, and naturalness of the generated speech.”
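
As a rough illustration of the training setup described above, the sketch below pairs each silent-attempt recording with target speech generated from the prompted text by a pre-trained text-to-speech model personalized to the patient’s pre-injury voice, since no ground-truth audio exists. The names used here (`tts`, `decoder.fit_step`, the trial format) are assumptions for illustration, not the study’s actual code.

```python
# Illustrative sketch (an assumption, not the published training pipeline):
# silent attempts yield no ground-truth audio, so targets are synthesized from the
# prompted text with a TTS model built from pre-injury voice recordings.

def build_training_pair(neural_features, prompt_text, tts):
    """One trial: neural activity from a silent attempt plus the sentence shown on screen."""
    target_speech = tts.synthesize_features(prompt_text)  # stand-in for the missing audio
    return neural_features, target_speech


def train_decoder(decoder, trials, tts, epochs=10):
    """Fit the neural-to-speech mapping against the TTS-generated targets."""
    for _ in range(epochs):
        for neural_features, prompt_text in trials:
            x, y = build_training_pair(neural_features, prompt_text, tts)
            decoder.fit_step(x, y)  # e.g. one gradient step on a spectral loss
    return decoder
```

Under a setup like this, generalization to new words would come from the decoder learning how neural activity maps onto speech sounds, rather than memorizing specific sentences.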