J Neural Eng. 2018 Dec;15(6):066025. doi: 10.1088/1741-2552/aae329. Epub 2018 Sep 21.

The influence of prior pronunciations on sensorimotor cortex activity patterns during vowel production

E Salari et al. J Neural Eng. 2018 Dec.

Abstract

Objective: In recent years, brain-computer interface (BCI) systems have been investigated as potential communication devices for people with severe paralysis. Decoding speech sensorimotor cortex activity is a promising avenue for the generation of BCI control signals, but it is complicated by variability in neural patterns, which leads to suboptimal decoding. We investigated whether the neural pattern variability associated with sound pronunciation can be explained by prior pronunciations, and determined to what extent prior speech affects BCI decoding accuracy.

Approach: Neural patterns in speech motor areas were evaluated with electrocorticography in five epilepsy patients, who performed a simple speech task involving pronunciation of the /i/ sound preceded by silence, the /a/ sound, or the /u/ sound.

Main results: The neural pattern related to the /i/ sound depends on the preceding sound and is therefore associated with multiple distinct sensorimotor patterns, likely reflecting differences in the movements towards this sound. We also show that these patterns retain a commonality that is distinct from the other vowel sounds (/a/ and /u/). Classification accuracies for decoding the different sounds do increase, however, when the multiple patterns for the /i/ sound are taken into account. Simply including multiple forms of the /i/ vowel in the training set for a single /i/ model performs as well as training an individual model for each /i/ variant.
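To make the two training schemes compared here concrete, pooling all /i/ variants into one class template versus keeping one template per variant, the following is a minimal Python sketch. The function names, data shapes, and the correlation-based matcher are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def correlation(a, b):
    """Pearson correlation between two flattened activity patterns."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def pooled_template(trials):
    """Single class template: average over all trials, regardless of variant."""
    return np.mean(trials, axis=0)

def per_variant_templates(trials_by_variant):
    """One template per variant (e.g. isolated /i/, /i/ after /a/, /i/ after /u/)."""
    return [np.mean(t, axis=0) for t in trials_by_variant]

def classify(trial, class_templates):
    """Assign the class whose template correlates best with the trial.

    class_templates maps a class label to a single template or to a list of
    variant templates; for a list, the best-matching variant scores the class.
    """
    def score(tpl):
        variants = tpl if isinstance(tpl, list) else [tpl]
        return max(correlation(trial, t) for t in variants)
    return max(class_templates, key=lambda c: score(class_templates[c]))
```

Under this sketch, "multiple forms in one training set" corresponds to averaging all variants into one pooled template, while "individual models" corresponds to a list of per-variant templates scored by their best match.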

Significance: Our results are of interest for the development of BCIs that aim to decode speech sounds from the sensorimotor cortex. They argue that the multitude of cortical activity patterns associated with speech movements can be reduced to a basis set of models that reflect meaningful language units (vowels), yet it remains important to account for the variety of neural patterns associated with a single sound during training.


Figures

Figure 1. Confusion matrices and average responses for the different /i/ productions.
For each participant (A-E), the confusion matrix for classification of the different /i/ HFB patterns is shown on the left. The numbers indicate the percentage (%) of trials within the class indicated by the rows that were classified as the class indicated by the columns. On the right, the average HFB response (over electrodes and trials) is shown for each participant and each condition, with time on the x-axis and HFB power (in arbitrary units) on the y-axis. Voice onset and voice transition are shown by vertical red lines, and the corresponding response peak times by vertical black lines. Trials were aligned to voice transition, and the standard deviation of voice onset is indicated with a red horizontal bar; since all trials are aligned to voice transition, no standard deviation is shown for the transition itself. The sound before the /i/ is indicated without slashes to highlight that classification focused specifically on the /i/ sound here.
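The row-normalized percentages in these confusion matrices can be computed in a few lines; a sketch (the function name and label encoding are my assumptions):

```python
import numpy as np

def confusion_percentages(true_labels, predicted_labels, classes):
    """Row-normalized confusion matrix: entry [r, c] is the percentage of
    trials of true class r that were classified as class c."""
    idx = {c: k for k, c in enumerate(classes)}
    counts = np.zeros((len(classes), len(classes)))
    for t, p in zip(true_labels, predicted_labels):
        counts[idx[t], idx[p]] += 1
    # Normalize each row by its trial count, so every row sums to 100%.
    return 100.0 * counts / counts.sum(axis=1, keepdims=True)
```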
Figure 2. Sensorimotor cortex electrodes showing differences in HFB amplitude between the different /i/ sound conditions.
The brain surface rendering of each subject (A-E) is shown, with the sensorimotor cortex highlighted in white and the central sulcus in red. For each electrode, an ANOVA tested for differences between the HFB amplitudes during the different /i/ productions; the inverted p-values (1 - p) are shown by the color shading of each electrode. Lighter colors represent statistically stronger differences between the /i/ sound conditions.
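The per-electrode statistic behind this shading can be sketched as a one-way ANOVA across conditions; the data layout and function name below are assumptions, not the authors' code:

```python
import numpy as np
from scipy.stats import f_oneway

def electrode_condition_contrast(amps_by_condition):
    """One-way ANOVA per electrode across /i/ conditions.

    amps_by_condition: list of arrays, one per condition, each shaped
    (n_trials, n_electrodes) with HFB amplitudes.
    Returns 1 - p per electrode, so larger values (lighter colors in the
    figure) mean statistically stronger condition differences.
    """
    n_elec = amps_by_condition[0].shape[1]
    inv_p = np.empty(n_elec)
    for e in range(n_elec):
        groups = [a[:, e] for a in amps_by_condition]
        _, p = f_oneway(*groups)
        inv_p[e] = 1.0 - p
    return inv_p
```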
Figure 3. Average confusion matrices for classification of the three different sounds using the different /i/ template methods.
The average (over subjects) confusion matrices are shown for each of the three methods used to create /i/ templates. The numbers indicate the percentage (%) of trials within the class indicated by the rows that were classified as the class indicated by the columns.
Figure 4. Average and individual classification accuracies for the different sounds using the three /i/ template methods.
Panel (a) shows the average (over subjects) classification accuracies for each of the three /i/ template methods; panel (b) shows the same for each individual separately. In panel (a), an asterisk indicates a significant difference in accuracy between methods. The horizontal red line indicates the average chance level (over subjects and methods).
Figure 5. Classification accuracy between the three conditions over time, using templates generated at moving time points.
The classification accuracy (y-axis) is shown at each time point (x-axis) between the three conditions (/i/, /u/-/i/ and /a/-/i/) for each subject. Time point 0 is 2 seconds before transition cue onset. Templates were based on the same time point as the time point of classification (rather than only on the peak neural activity, as used earlier). Voice onset, transition and offset are indicated by vertical red lines. The time point on which the /i/ templates were originally based (the peak in neural activity) is indicated by a dashed black line. Note that for most participants the most separable moment was around the peak of neural activity. For subject E, the peak of neural activity was rather wide (see figure 1), which may explain the time difference between the accuracy peak and the neural response peak. Chance level (33.33%) is indicated by a thin horizontal line, and the thick horizontal line indicates the accuracy level at which classification is significantly above chance. Note that at trial onset classification should be at chance level, since at that moment all three templates were based on rest and cannot predict the upcoming pronunciation. Between voice onset and voice transition, the classification is between rest, the /u/ sound and the /a/ sound. Between voice transition and voice offset, the classification is between the three /i/ conditions (the isolated /i/, the /i/ preceded by /u/ and the /i/ preceded by /a/), effectively predicting whether and which sound was said just before the /i/.
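The moving-template analysis can be sketched as leave-one-out correlation classification at each time point. The array shapes and names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def timepoint_accuracy(trials, labels, t):
    """Leave-one-out template classification at one time point t.

    trials: array (n_trials, n_electrodes, n_times); the template for each
    condition is the mean activity at t over the remaining trials of that
    condition. Returns the percentage of trials classified correctly at t.
    """
    labels = np.asarray(labels)
    correct = 0
    for k in range(len(trials)):
        x = trials[k, :, t]
        best_label, best_r = None, -np.inf
        for c in np.unique(labels):
            mask = labels == c
            mask[k] = False  # leave the test trial out of its own template
            template = trials[mask, :, t].mean(axis=0)
            r = np.corrcoef(x, template)[0, 1]
            if r > best_r:
                best_label, best_r = c, r
        correct += int(best_label == labels[k])
    return 100.0 * correct / len(trials)
```

Sweeping t across the trial and plotting the result yields a curve like those in this figure: at chance before any condition information is present, and peaking where the conditions' activity patterns are most separable.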

