Machine translation of cortical activity to text with an encoder-decoder framework
- PMID: 32231340
- PMCID: PMC10560395
- DOI: 10.1038/s41593-020-0608-8
Abstract
A decade after speech was first decoded from human brain signals, accuracy and speed remain far below those of natural speech. Here we show how to decode the electrocorticogram with high accuracy and at natural-speech rates. Taking a cue from recent advances in machine translation, we train a recurrent neural network to encode each sentence-length sequence of neural activity into an abstract representation, and then to decode this representation, word by word, into an English sentence. For each participant, data consist of several spoken repeats of a set of 30-50 sentences, along with the contemporaneous signals from ~250 electrodes distributed over peri-Sylvian cortices. Average word error rates across a held-out repeat set are as low as 3%. Finally, we show how decoding with limited data can be improved with transfer learning, by training certain layers of the network on multiple participants' data.
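The encode-then-decode scheme described in the abstract can be illustrated with a minimal sketch: an encoder RNN summarizes the full sentence-length window of electrode activity into a hidden state, and a decoder RNN then emits one word at a time, feeding its previous prediction back in. This is a hedged illustration only; all class names, layer sizes, vocabulary size, and the start-token convention below are assumptions, not the paper's implementation (which also uses convolutional downsampling, teacher forcing during training, and other details omitted here).

```python
import torch
import torch.nn as nn

class Seq2SeqSpeechDecoder(nn.Module):
    """Illustrative encoder-decoder for ECoG-to-text (dimensions are made up)."""

    def __init__(self, n_electrodes=256, hidden=128, vocab_size=200, max_words=10):
        super().__init__()
        # Encoder: consumes the neural time series, one feature vector per time step.
        self.encoder = nn.GRU(n_electrodes, hidden, batch_first=True)
        # Decoder: emits the sentence word by word from the encoder's summary state.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)
        self.max_words = max_words

    def forward(self, ecog):
        # Encode the whole sentence-length neural sequence into one hidden state.
        _, state = self.encoder(ecog)
        batch = ecog.size(0)
        token = torch.zeros(batch, dtype=torch.long)  # assumed <start> token id = 0
        logits = []
        # Greedy word-by-word decoding, feeding back the previous prediction.
        for _ in range(self.max_words):
            step, state = self.decoder(self.embed(token).unsqueeze(1), state)
            step_logits = self.out(step.squeeze(1))
            logits.append(step_logits)
            token = step_logits.argmax(dim=-1)
        return torch.stack(logits, dim=1)  # (batch, max_words, vocab_size)

model = Seq2SeqSpeechDecoder()
ecog = torch.randn(2, 300, 256)  # 2 trials, 300 time steps, 256 electrodes
logits = model(ecog)
print(tuple(logits.shape))
```

The transfer-learning result in the abstract corresponds, in this sketch, to sharing some modules (for example the decoder and output layers) across participants while keeping the encoder input layer participant-specific, so limited data from one participant can borrow statistical strength from the others.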
Conflict of interest statement
This work was funded in part by Facebook Reality Labs. UCSF holds patents related to speech decoding.
Comment in
- Translating the brain. Nat Neurosci. 2020 Apr;23(4):471-472. doi: 10.1038/s41593-020-0616-8. PMID: 32231339.