medRxiv [Preprint]. 2024 Apr 10:2023.12.26.23300110.
doi: 10.1101/2023.12.26.23300110.

An accurate and rapidly calibrating speech neuroprosthesis


Nicholas S Card et al. medRxiv.

Update in

  • Card NS, Wairagkar M, Iacobacci C, Hou X, Singer-Clark T, Willett FR, Kunz EM, Fan C, Vahdati Nia M, Deo DR, Srinivasan A, Choi EY, Glasser MF, Hochberg LR, Henderson JM, Shahlaie K, Stavisky SD, Brandman DM. An Accurate and Rapidly Calibrating Speech Neuroprosthesis. N Engl J Med. 2024 Aug 15;391(7):609-618. doi: 10.1056/NEJMoa2314132. PMID: 39141853. Free PMC article.

Abstract

Brain-computer interfaces can enable rapid, intuitive communication for people with paralysis by transforming the cortical activity associated with attempted speech into text on a computer screen. Despite recent advances, communication with brain-computer interfaces has been restricted by extensive training data requirements and inaccurate word output. A man in his 40s with ALS, with tetraparesis and severe dysarthria (ALSFRS-R = 23), was enrolled in the BrainGate2 clinical trial. He underwent surgical implantation of four microelectrode arrays into his left precentral gyrus, which recorded neural activity from 256 intracortical electrodes. We report a speech neuroprosthesis that decoded his neural activity as he attempted to speak in both prompted and unstructured conversational settings. Decoded words were displayed on a screen, then vocalized using text-to-speech software designed to sound like his pre-ALS voice. On the first day of system use, following 30 minutes of attempted speech training data, the neuroprosthesis achieved 99.6% accuracy with a 50-word vocabulary. On the second day, the size of the possible output vocabulary increased to 125,000 words, and, after 1.4 additional hours of training data, the neuroprosthesis achieved 90.2% accuracy. With further training data, the neuroprosthesis sustained 97.5% accuracy beyond eight months after surgical implantation. The participant has used the neuroprosthesis to communicate in self-paced conversations for over 248 hours. In an individual with ALS and severe dysarthria, an intracortical speech neuroprosthesis reached a level of performance suitable to restore naturalistic communication after a brief training period.


Figures

Figure 1. Electrode locations and speech decoding setup.
a, Approximate microelectrode array locations (black squares) superimposed on a 3D reconstruction of the participant’s brain. Colored regions correspond to the Human Connectome Project’s multi-modal atlas of cortical areas, aligned to the participant’s brain using pre-implantation scans acquired with the Human Connectome Project’s MRI protocol; the array locations are concordant with the precentral gyrus on an MNI template brain (Figure S11). b, Diagram of the brain-to-text speech neuroprosthesis. Cortical neural activity is measured from the left ventral precentral gyrus using four 64-electrode Utah arrays. Machine learning techniques decode the cortical neural activity into an English phoneme every 80 ms. Using a series of language models (LM), the predicted phoneme sequence is translated into a series of words that appear on a screen as the participant tries to speak. At the end of a sentence, an own-voice text-to-speech algorithm, designed to emulate the participant’s voice prior to developing ALS, vocalizes the decoded sentence (Section S5).
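The decoding scheme in panel b (one phoneme prediction per 80 ms window, then a language-model search over words) can be illustrated with a minimal sketch. This is not the authors' implementation: the frame labels, blank token, CTC-style collapse, and toy lexicon below are all hypothetical stand-ins for the trained decoder and language models described in the paper.

```python
# Illustrative sketch of frame-wise phoneme decoding followed by a word lookup.
# Assumptions (not from the paper): a "_" blank token, CTC-style collapsing of
# repeated frame labels, and a toy lexicon in place of the LM word search.

BLANK = "_"

def collapse(frames):
    """Collapse consecutive duplicate labels, then drop blank tokens."""
    out = []
    for p in frames:
        if not out or p != out[-1]:
            out.append(p)
    return [p for p in out if p != BLANK]

# Hypothetical per-80-ms decoder output spelling the word "hi" (ARPAbet HH AY)
frames = ["_", "HH", "HH", "_", "AY", "AY", "AY", "_"]
phonemes = collapse(frames)  # ["HH", "AY"]

# Toy lexicon standing in for the language-model translation to words
LEXICON = {("HH", "AY"): "hi", ("N", "OW"): "no"}
word = LEXICON.get(tuple(phonemes), "<unk>")
print(phonemes, word)  # ['HH', 'AY'] hi
```

In the actual system the per-frame outputs are probability distributions and the language models search over whole phoneme-sequence hypotheses rather than exact lexicon keys; the sketch only conveys the two-stage structure.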
Figure 2. Online speech decoding performance.
Phoneme error rates (top) and word error rates (bottom) are shown for each session for two vocabulary sizes (50 versus 125,000 words). Reference error rates from two previous speech neuroprosthesis studies are plotted as horizontal dashed lines. The horizontal axis displays the research session number, the number of days since array implantation, and the cumulative hours of neural data used to train the speech decoder for that session. Aggregate error rates across all evaluation sentences are shown for each session (mean ± 95% confidence interval). Vertical dashed lines mark when decoder improvements were introduced. Fig. S20 shows phoneme and word error rates for individual blocks.
Figure 3. Extensive use of the neuroprosthesis for accurate self-initiated speech.
a, Photograph of the participant and speech neuroprosthesis in Conversation Mode. The neuroprosthesis detected, based solely on neural activity, when he was trying to speak, and concluded decoding either after 6 seconds of speech inactivity or upon his optional activation of an on-screen button via eye tracking. After the decoded sentence was finalized, the participant used on-screen confirmation buttons to indicate whether the decoded sentence was correct. b, Sample transcript of the participant using the speech neuroprosthesis to speak with his daughter on the second day of use (Video 3). Additional transcripts are available in Table S4. c, Cumulative hours that the participant used the speech neuroprosthesis to communicate with those around him in structured research sessions and during personal use. For sessions represented by points outlined in red, decoding accuracy is quantified in (d). The distribution of self-reported decoding accuracy for each sentence across all Conversation Mode data (n = 21,829) is shown in the inset pie chart. Sentences for which the participant did not self-report decoding accuracy within 30 seconds of sentence completion are excluded (n = 868). d, Evaluation of speech decoding accuracy in conversations (n = 925 sentences with known true labels, sourced from the red-outlined sessions in (c)). The average word error rate was 3.7% (95% CI, 3.3% to 4.3%).
