A high-performance neuroprosthesis for speech decoding and avatar control
- PMID: 37612505
- PMCID: PMC10826467
- DOI: 10.1038/s41586-023-06443-4
Erratum in
- Author Correction: A high-performance neuroprosthesis for speech decoding and avatar control. Nature. 2024 Jul;631(8021):E13. doi: 10.1038/s41586-024-07735-z. PMID: 38965438.
Abstract
Speech neuroprostheses have the potential to restore communication to people living with paralysis, but naturalistic speed and expressivity are elusive [1]. Here we use high-density surface recordings of the speech cortex in a clinical-trial participant with severe limb and vocal paralysis to achieve high-performance real-time decoding across three complementary speech-related output modalities: text, speech audio and facial-avatar animation. We trained and evaluated deep-learning models using neural data collected as the participant attempted to silently speak sentences. For text, we demonstrate accurate and rapid large-vocabulary decoding with a median rate of 78 words per minute and median word error rate of 25%. For speech audio, we demonstrate intelligible and rapid speech synthesis and personalization to the participant's pre-injury voice. For facial-avatar animation, we demonstrate the control of virtual orofacial movements for speech and non-speech communicative gestures. The decoders reached high performance with less than two weeks of training. Our findings introduce a multimodal speech-neuroprosthetic approach that has substantial promise to restore full, embodied communication to people living with severe paralysis.
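The two text-decoding metrics quoted in the abstract, word error rate and words per minute, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `word_error_rate` and `words_per_minute`, the whitespace tokenization, and the example sentences are all assumptions made for illustration. Word error rate is taken here in its standard form, the word-level edit distance between the decoded and reference sentences divided by the number of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper: assumes simple whitespace tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn ref[:i] into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def words_per_minute(hypothesis: str, seconds: float) -> float:
    """Decoded words per minute (assumes decoded words are what is counted)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return 60.0 * len(hypothesis.split()) / seconds


if __name__ == "__main__":
    # Made-up sentences, not data from the study.
    reference = "the quick brown fox"
    decoded = "the quick brown box"
    print(word_error_rate(reference, decoded))  # one substitution in four words
    print(words_per_minute(decoded, 3.0))       # four words decoded in 3 seconds
```

In this toy case one of four reference words is wrong, so the word error rate is 0.25. The abstract reports medians, so a full evaluation would compute such values per sentence or block and then take the median.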
© 2023. The Author(s), under exclusive licence to Springer Nature Limited.
Comment in
- Restoring speech. Nat Rev Neurosci. 2023 Nov;24(11):653. doi: 10.1038/s41583-023-00746-1. PMID: 37740095.
References
- Beukelman DR et al. Augmentative and Alternative Communication (Paul H. Brookes, 1998).
- Graves A, Fernández S, Gomez F & Schmidhuber J. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In Proc. 23rd International Conference on Machine Learning - ICML ’06 (eds Cohen W & Moore A) 369–376 (ACM Press, 2006). doi: 10.1145/1143844.1143891.