J Neurosci. 2018 Nov 14;38(46):9803-9813.
doi: 10.1523/JNEUROSCI.1206-18.2018. Epub 2018 Sep 26.

Differential Representation of Articulatory Gestures and Phonemes in Precentral and Inferior Frontal Gyri

Emily M Mugler et al. J Neurosci.

Abstract

Speech is a critical form of human communication and is central to our daily lives. Yet, despite decades of study, an understanding of the fundamental neural control of speech production remains incomplete. Current theories model speech production as a hierarchy from sentences and phrases down to words, syllables, speech sounds (phonemes), and the actions of vocal tract articulators used to produce speech sounds (articulatory gestures). Here, we investigate the cortical representation of articulatory gestures and phonemes in ventral precentral and inferior frontal gyri in men and women. Our results indicate that ventral precentral cortex represents gestures to a greater extent than phonemes, while inferior frontal cortex represents both gestures and phonemes. These findings suggest that speech production shares a common cortical representation with that of other types of movement, such as arm and hand movements. This has important implications both for our understanding of speech production and for the design of brain-machine interfaces to restore communication to people who cannot speak.

SIGNIFICANCE STATEMENT: Despite being studied for decades, the production of speech by the brain is not fully understood. In particular, the most elemental parts of speech, speech sounds (phonemes) and the movements of vocal tract articulators used to produce these sounds (articulatory gestures), have both been hypothesized to be encoded in motor cortex. Using direct cortical recordings, we found evidence that primary motor and premotor cortices represent gestures to a greater extent than phonemes. Inferior frontal cortex (part of Broca's area) appears to represent both gestures and phonemes. These findings suggest that speech production shares a similar cortical organizational structure with the movement of other body parts.

Keywords: articulatory gestures; brain–computer interface; encoding; phonemes; segments; speech production.

Figures

Figure 1.
Defining phoneme and articulatory gesture onsets. A, Cerebral cortex of Subject 5 (S5) with recorded regions of speech motor cortex highlighted: IFG (green), aPCG (blue), and pPCG (purple). B, Vocal tract with positions of the lips, tongue body, and tongue tip during production of a single word. Each trace represents the position, at 10-ms intervals, generated by the AAI model, from word onset (green) to word offset (magenta; see corresponding colors in C). C, Example audio signal, and corresponding audio spectrogram, from S5 with labeled phonemic event onsets (blue vertical lines) mapped to vocal tract articulatory gesture positions. Target apertures for each articulatory gesture action are marked from open (open circle), to critical (half-filled circle), to closed (filled circle). Note that larynx has opposite open/close orientation as its default configuration is assumed to be near closure (vibrating; Browman and Goldstein, 1992). Also note that while the initial and final consonants are associated with a specific velum-closing action, the vowel does not specify such a gesture (thus, the state of the velum during the vowel depends on the surrounding gestures).
Figure 2.
Electrode array locations for all nine subjects. Top schematic shows the approximate area of cortex (rectangle) displayed for each subject. Shaded areas represent the different cortical areas: IFG (green), aPCG (blue), and pPCG (purple). Note that Subject 2 was implanted in the right hemisphere and so anterior–posterior direction is reversed. IFG electrodes in Subject 9 were excluded because they were too close to the tumor margin. CS, Central sulcus; SF, Sylvian fissure.
Figure 3.
Variation of cortical activity with intraword position of phonemes and gestures. Phoneme-related activity changes with context, while gesture-related activity does not. A, Mean (±SD; shaded areas) high gamma activity on two electrodes in Subject 5 (S5) aligned to onset of the phoneme (left) or gesture (right) event. Activity is separated into instances of all events [/t/ or /k/ for phonemes, tongue tip closure (TTC) or tongue body closure (TBC) for gestures] occurring either at the beginning of a word (light green) or at the end of a word (dark green). Gray dashed lines mark the −100 to 50 ms interval around onset. B, An example of classification accuracy (mean ± 95% CI) of intraword position on one electrode related to either tongue body (e56; left, same as bottom plots in A) or tongue tip (e46; right, same as top plots in A) in S5 for phonemes (blue) and gestures (red). Gestural position classification does not outperform chance (gray), while phonemic position classification performs significantly higher than chance. C, Spatial distribution of d′ for differences between phonemic and gestural position accuracy and chance. On tongue tip- and tongue body-related electrodes (outlined electrodes), phonemic position accuracy is much higher than chance, while gestural position accuracy is not. Shaded areas correspond to cortical areas as in Figure 2A.
Figure 4.
Classification of phonemes and gestures. A, Mean (±SEM over subjects) classification accuracy using combined aPCG and pPCG activity of phonemes (blue squares) and gestures (red circles). Shown are both raw accuracy (left; dotted lines showing chance accuracy) and accuracy relative to chance (right). Gestures were classified significantly (*) more accurately than phonemes. B, Classification accuracy for phonemes (blue) and gestures (red) using activity from IFG, aPCG, and pPCG separately, for subject S5 (left; ±SD) and population mean (right; ±SEM). C, Accuracy relative to chance in each area for S5 (left) and population mean (right). Gesture classification was significantly higher than phoneme classification in pPCG and aPCG (*). D, d′ values (mean ± SEM over subjects) between gesture and phoneme accuracies in each area.
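The d′ values in panel D summarize how separable two sets of accuracies are. The page does not give the formula used, so the following is only a minimal sketch that assumes the common pooled-standard-deviation definition of d′; the function name `d_prime` and the accuracy values are hypothetical and are not taken from the study.

```python
import math
import statistics


def d_prime(a, b):
    """Sensitivity index between two samples.

    Assumes the pooled-SD form: (mean(a) - mean(b)) / sqrt((var(a) + var(b)) / 2).
    The study's exact definition is not stated on this page.
    """
    mean_diff = statistics.mean(a) - statistics.mean(b)
    # Sample variances (n - 1 denominator), averaged across the two groups.
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return mean_diff / pooled_sd


# Hypothetical per-subject classification accuracies, for illustration only.
gesture_acc = [0.40, 0.50, 0.60]
phoneme_acc = [0.20, 0.30, 0.40]

print(round(d_prime(gesture_acc, phoneme_acc), 3))
```

With these made-up numbers the means differ by 0.2 and each group has a standard deviation of 0.1, so d′ comes out at about 2; larger values indicate more discriminable accuracy distributions.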
Figure 5.
Classification of consonant allophones using ECoG from each cortical area. A, Examples of audio waveforms, averaged spectrograms, and gestures for an allophone set ({/t/,/st/,/d/}) aligned to vowel onset (black vertical line). Only the trajectories for articulators that show differences for these phonemes are depicted [filled circle, closed; open circle, open; half-filled circle, partial closure (critical)]. Colors throughout the figure represent VLC (/t/, blue), VC (/d/, orange), and CClA (/st/, gray). B, Examples of normalized high gamma activity (mean ± SE) at three electrodes during /t/, /d/, and /st/ production in S5. Allophone onset is at time 0. One electrode from each cortical area is shown. CClA activity (gray) in these IFG and aPCG electrodes is more similar to the VLC (blue), especially at approximately time 0, while in pPCG it is more similar to VC (orange). C, Schematic depicting three different idealized performance patterns in a single cortical area. Solid circles denote the performance of the classification of VLCs (blue) and VCs (orange) using their respective classifiers. Gray-filled circles denote CClA classification performance using the VLC (blue outline) and VC (orange outline) classifiers. High CClA performance (close to that of the respective solid color) would indicate that the allophone behaved more like the VLC or VC than like other consonants in the dataset. Blue rectangle, CClA performed similarly to the VLC; orange rectangle, CClA performed similarly to the VC; green rectangle, CClA performed differently than both VLCs and VCs. D, Classification performance (mean ± SEM across subjects and allophone sets) in each cortical area of VLCs and CClAs in voiceless classifiers, and VCs and CClAs in voiced classifiers. In pPCG, CClAs perform much worse than VLCs on the VLC classifiers, while in IFG and aPCG the two perform much more similarly. The opposite trend occurs with CClA performance on the VC classifiers. E, d′ values (mean ± SEM across subjects and sets) between the singlet consonant performance and allophone consonant performance for each area; larger values are more discriminable. Blue circles, VLC vs CClA performance using VLC classifiers; orange circles, VC vs CClA performance using VC classifiers. In summary, CClAs perform more like VLCs and less like VCs moving from posterior to anterior.
