J Neurolinguistics. 2012 Sep 1;25(5):408-422.
doi: 10.1016/j.jneuroling.2009.08.006.

A Neural Theory of Speech Acquisition and Production

Frank H Guenther et al. J Neurolinguistics.

Abstract

This article describes a computational model, called DIVA, that provides a quantitative framework for understanding the roles of various brain regions involved in speech acquisition and production. An overview of the DIVA model is first provided, along with descriptions of the computations performed in the different brain regions represented in the model. Particular focus is given to the model's speech sound map, which provides a link between the sensory representation of a speech sound and the motor program for that sound. Neurons in this map share with "mirror neurons" described in monkey ventral premotor cortex the key property of being active during both production and perception of specific motor actions. As the DIVA model is defined both computationally and anatomically, it is ideal for generating precise predictions concerning speech-related brain activation patterns observed during functional imaging experiments. The DIVA model thus provides a well-defined framework for guiding the interpretation of experimental results related to the putative human speech mirror system.

Figures

Figure 1
Schematic of the cortical components of the DIVA model of speech acquisition and production. The mediating neural representation linking auditory and motor reference frames is the speech sound map, proposed to reside in the left posterior inferior frontal gyrus (Broca’s area) and adjoining ventral premotor cortex. Additional details of the model are described in the text.
Figure 2
Top. Lateral surfaces of the brain indicating locations of significant activations (random effects; statistics controlled at a false discovery rate of 0.05) measured in an fMRI experiment of single syllable production (speech - baseline contrast, where the baseline task consisted of silently viewing the letters YYY on the video screen). Middle right. Lateral surface of the brain indicating locations of the DIVA model components in the left hemisphere. Medial regions (superior paravermal cerebellum and deep cerebellar nuclei) are not visible. Unless otherwise noted, labels along the central sulcus correspond to the motor (anterior) and somatosensory (posterior) representation for each articulator. Bottom. Simulated fMRI activations from the DIVA model when performing the same speech task as the subjects in the fMRI experiment. [Abbreviations: Aud = auditory state neurons; ΔA = auditory error neurons; ΔS = somatosensory error neurons; Lat Cbm = superior lateral cerebellum; Resp = motor respiratory region; SSM = speech sound map. *Palate representation is somatosensory only. Respiratory representation is motor only.]
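The caption above states that the activation statistics were controlled at a false discovery rate of 0.05 but does not name the procedure used. As an illustration only, the sketch below shows the widely used Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Mark which p-values are significant under Benjamini-Hochberg FDR control.

    Assumption: this is a generic textbook procedure, not necessarily the
    exact method used for the fMRI analysis described in the caption.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Every test ranked at or below max_k is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            significant[idx] = True
    return significant


# Hypothetical p-values, standing in for per-voxel or per-region tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_flags = benjamini_hochberg(example_p, q=0.05)
```

With these made-up inputs only the two smallest p-values pass, because the threshold for rank k is (k / 10) × 0.05.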
Figure 3
Timeline for a single trial in the fMRI speech perturbation protocol. The subject reads the stimulus out loud during stimulus presentation, when the scanner is not collecting images and is thus quiet. Images are acquired approximately 2 seconds after articulation ceases. [Abbreviations: HR = estimated hemodynamic response; A1,A2 = acquisition periods of two full brain scans.]
Figure 4
Regions of significant activation in the perturbed speech - unperturbed speech contrast of an fMRI speech perturbation experiment investigating the effects of unexpected perturbation of auditory feedback (30% shift of the first formant frequency) during single word reading. Peak activations were found in the superior temporal gyrus bilaterally and in right hemisphere ventral premotor cortex/inferior frontal gyrus.
Figure 5
Comparison of first formant frequency (F1) trajectories produced by the DIVA model (lines) and human subjects (shaded regions) when F1 is unexpectedly perturbed during production of a syllable. Utterances were perturbed by shifting F1 upward or downward by 30% throughout the syllable. Traces are shown for 300 ms starting from the onset of the perturbation at the beginning of vocalization. Shaded areas denote the 95% confidence interval for normalized F1 values during upward (dark) and downward (light) perturbations in the experimental study. Lines indicate values obtained from a DIVA model simulation of the auditory perturbation experiment. Both the model and the experimental subjects show compensation for the perturbation starting approximately 75-150 ms after perturbation onset.
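The qualitative pattern in this caption (no change at first, then a shift in produced F1 opposing the perturbation) can be illustrated with a toy delayed-feedback loop. This is a minimal sketch under stated assumptions, not the DIVA model's equations: the function `simulate_f1`, the 5 ms time step, the 100 ms feedback delay (chosen from within the 75-150 ms range mentioned above), and the correction gain are all hypothetical, and the size of the simulated compensation is not fit to any data.

```python
def simulate_f1(shift, n_steps=60, dt_ms=5.0, delay_ms=100.0, gain=0.02):
    """Toy simulation of produced F1 (normalized) under an auditory F1 shift.

    shift: fractional perturbation of heard F1 (+0.3 = 30% upward).
    Returns the produced F1 trajectory, n_steps + 1 samples long
    (300 ms at the default settings).
    """
    target = 1.0  # normalized auditory target for F1
    delay_steps = int(round(delay_ms / dt_ms))
    produced = [target]
    heard = []
    for t in range(n_steps):
        # The speaker hears their own F1 scaled by the perturbation.
        heard.append(produced[t] * (1.0 + shift))
        # The auditory error only becomes available after the feedback delay.
        if t >= delay_steps:
            error = target - heard[t - delay_steps]
        else:
            error = 0.0
        # Nudge production in the direction that reduces the heard error.
        produced.append(produced[t] + gain * error)
    return produced


up = simulate_f1(+0.3)    # heard F1 shifted upward by 30%
down = simulate_f1(-0.3)  # heard F1 shifted downward by 30%
```

In this sketch the produced F1 stays at the target for the first 100 ms, then falls for the upward shift and rises for the downward shift, mirroring the direction of compensation described in the caption.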
Figure 6
The left half of the figure shows regions of significant activation in the perturbed speech - unperturbed speech contrast of an fMRI experiment investigating the effects of unexpected jaw perturbation during single word reading. The right half of the figure shows the results of simulations of the DIVA model during jaw-perturbed speech made prior to the experiment.
Figure 7
Adaptive response to systematic perturbation of F1 during a sensorimotor adaptation experiment (solid lines) compared to DIVA model simulations of the same experiment (shaded area). The solid line with standard error bars represents experimental data collected from 20 subjects. The shaded region represents the 95% confidence interval derived from DIVA model simulations. The total duration of the experiment (65 epochs) was approximately 100 minutes. The horizontal dashed line indicates the baseline F1 value.
