Review

Nat Neurosci. 2009 Jun;12(6):718-24. doi: 10.1038/nn.2331. Epub 2009 May 26.

Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing

Josef P Rauschecker et al.

Abstract

Speech and language are considered uniquely human abilities: animals have communication systems, but they do not match human linguistic skills in terms of recursive structure and combinatorial power. Yet, in evolution, spoken language must have emerged from neural mechanisms at least partially available in animals. In this paper, we will demonstrate how our understanding of speech perception, one important facet of language, has profited from findings and theory in nonhuman primate studies. Chief among these are physiological and anatomical studies showing that primate auditory cortex, across species, shows patterns of hierarchical structure, topographic mapping and streams of functional processing. We will identify roles for different cortical areas in the perceptual processing of speech and review functional imaging work in humans that bears on our understanding of how the brain decodes and monitors speech. A new model connects structures in the temporal, frontal and parietal lobes linking speech perception and production.


Figures

Figure 1
Dual processing scheme for ‘what’ and ‘where’, proposed for nonhuman primates on anatomical and physiological grounds. V1, primary visual cortex; A1, primary auditory cortex; IT, inferior temporal region; ST, superior temporal region; PPC, posterior parietal cortex; VLPFC, ventrolateral prefrontal cortex; DLPFC, dorsolateral prefrontal cortex. (Simplified from refs. 4, and combined with an existing scheme for the visual system from ref. 6.)
Figure 2
Communication calls consist of elementary features, such as bandpass noise bursts or frequency-modulated (FM) sweeps. Harmonic calls, such as the vocal scream from the rhesus monkey repertoire depicted here by its spectrogram and time signal amplitude (A, measured as output voltage of a sound meter), consist of fundamental frequencies and higher harmonics. The neural circuitry for processing such calls is thought to consist of small hierarchical networks. At the lowest level, there are neurons serving as FM detectors tuned to the rate and direction of FM sweeps; these detectors extract each FM component (shown in cartoon spectrograms) in the upward and downward sweeps of the scream. The output of these FM detectors is combined nonlinearly at the next level: the target neurons T1 and T2 possess a high threshold and fire only if all inputs are activated. At the final level, a ‘tonal-scream detector’ is created by again combining output from neurons T1 and T2 nonlinearly. Temporal integration is accomplished by having the output of T1 pass through a delay line with a latency Δt1 sufficient to hold up the input to the top neuron long enough that all inputs arrive at the same time. Early processing of human speech sounds in the antero-lateral auditory belt and parabelt cortex is thought to be accomplished in a similar way.
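For readers who think in code, the hierarchical coincidence-detection scheme described in the Figure 2 legend can be sketched in a few lines of Python. This sketch is purely illustrative and not from the paper: the similarity measure, the firing threshold, the delay handling and all names (fm_detector, target_neuron, scream_detector) are assumptions chosen to mirror the legend's description of high-threshold combination of FM-detector outputs and a delay line on T1's output.

import numpy as np

# Toy sketch (not from the paper) of the Figure 2 circuit: FM-detector outputs
# are combined nonlinearly by high-threshold "target" neurons (T1, T2), and a
# delay line aligns T1's earlier output with T2's so that the top-level
# 'tonal-scream detector' receives all of its inputs at the same time.

def fm_detector(spectrogram_slice, template):
    """All-or-none FM detector: responds only when the input matches its
    preferred rate/direction template closely enough (illustrative threshold)."""
    similarity = np.dot(spectrogram_slice, template) / (
        np.linalg.norm(spectrogram_slice) * np.linalg.norm(template) + 1e-9)
    return 1.0 if similarity > 0.8 else 0.0

def target_neuron(inputs, threshold):
    """High-threshold unit: fires only if (nearly) all inputs are active."""
    return 1.0 if sum(inputs) >= threshold else 0.0

def scream_detector(t1_history, t2_output, delay_steps):
    """Top-level unit: T1's output is read through a delay line of
    `delay_steps` samples so that both inputs arrive simultaneously."""
    delayed_t1 = t1_history[-delay_steps] if len(t1_history) >= delay_steps else 0.0
    return target_neuron([delayed_t1, t2_output], threshold=2)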
Figure 3
Multiple parallel input modules advocated by some as an alternative to the dual-stream model. According to this model, sensory information at the cortical level originates from primary-like areas (A1 and R in the auditory system; R is also referred to as A2 by analogy to visual area V2) and splits into multiple early processing streams: an object stream (green) originating from the antero-lateral belt (AL; or “A4” by analogy to area V4, involved in processing visual form); a spatial stream (red) originating from the caudo-lateral belt (CL; or “A5” by analogy to visual motion area V5); and other streams or streamlets originating from either area ML between AL and CL (“A3” by analogy to visual area V3) or from the medial belt (MB). RPB and CPB, rostral and caudal parabelt; T2 and T3, temporal cortical areas as defined by Burton and Jones; TPO, polymodal cortex in the upper bank of superior temporal sulcus; Tpt, parieto-temporal area.
Figure 4
Invariance in the perception of auditory objects (including vocalizations and speech) against transpositions in frequency, time or both. (a) Frequency-shifted monkey calls are behaviorally classified as the same by monkeys, presumably reflecting the response of higher-order neurons in anterior superior temporal cortex, even though the frequency contents of the monkey calls are markedly different. The example shows spectrograms of a tonal scream from a rhesus monkey frequency-shifted in steps of one octave. (b) Spectrograms of clear human speech (top) and of a six-channel noise-vocoded transformation of it (bottom). The noise-vocoded version of the sentence (“They're buying some bread.”) is easily comprehensible after short training, even though the sound is very impoverished in the spectral domain.
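The noise-vocoding manipulation mentioned in Figure 4b (splitting speech into a small number of frequency bands, extracting each band's slow amplitude envelope, and using it to modulate band-limited noise) can be sketched roughly as follows. This is a generic sketch, not the authors' code: the band edges, filter order and the use of a Hilbert envelope are illustrative assumptions, and scipy is assumed to be available.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

# Rough sketch (not from the paper) of a six-channel noise vocoder in the
# spirit of Figure 4b. Band edges below are illustrative and assume a
# sampling rate of at least 16 kHz.

def bandpass(signal, low_hz, high_hz, fs, order=4):
    b, a = butter(order, [low_hz / (fs / 2), high_hz / (fs / 2)], btype="band")
    return filtfilt(b, a, signal)

def noise_vocode(speech, fs, band_edges_hz=(100, 300, 700, 1500, 3000, 5000, 8000)):
    """Return a noise-vocoded version of `speech` with len(band_edges_hz)-1 channels."""
    rng = np.random.default_rng(0)
    vocoded = np.zeros_like(speech, dtype=float)
    for low, high in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        band = bandpass(speech, low, high, fs)
        envelope = np.abs(hilbert(band))             # slow amplitude envelope of this band
        carrier = bandpass(rng.standard_normal(len(speech)), low, high, fs)
        vocoded += envelope * carrier                # envelope-modulated band-limited noise
    return vocoded / (np.max(np.abs(vocoded)) + 1e-9)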
Figure 5
Dual auditory processing scheme of the human brain and the role of internal models in sensory systems. This expanded scheme closes the loop between speech perception and production and proposes a common computational structure for space processing and speech control in the postero-dorsal auditory stream. (a) Antero-ventral (green) and postero-dorsal (red) streams originating from the auditory belt. The postero-dorsal stream interfaces with premotor areas and pivots around inferior parietal cortex, where a quick sketch of sensory event information is compared with a predictive efference copy of motor plans. (b) In one direction, the model performs a forward mapping: object information, such as speech, is decoded in the antero-ventral stream all the way to category-invariant inferior frontal cortex (area 45), and is transformed into motor-articulatory representations (area 44 and ventral PMC), whose activation is transmitted to the IPL (and posterior superior temporal cortex) as an efference copy. (c) In the reverse direction, the model performs an inverse mapping, whereby attention- or intention-related changes in the IPL influence the selection of context-dependent action programs in PFC and PMC. Both types of dynamic model are testable using techniques with high temporal precision (for example, magnetoencephalography in humans or single-unit studies in monkeys) that allow determination of the order of events in the respective neural systems. AC, auditory cortex; STS, superior temporal sulcus; IFC, inferior frontal cortex; PMC, premotor cortex; IPL, inferior parietal lobule; CS, central sulcus. Numbers correspond to Brodmann areas.
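Computationally, the forward/inverse mappings described in the Figure 5 legend amount to an internal-model loop: an efference copy of the motor plan is run through a forward model to predict its sensory consequences, and the prediction is compared with actual feedback. The toy sketch below is an assumption-laden caricature of that idea, not a model from the paper; the linear motor-to-sensory mapping, the numbers and all names are hypothetical.

import numpy as np

# Toy sketch (not from the paper) of the internal-model idea in Figure 5:
# a forward model predicts the sensory consequence of a motor plan (panel b),
# and a pseudo-inverse of the same mapping stands in for the inverse mapping
# from a desired sensory goal to a motor command (panel c).

class InternalModel:
    def __init__(self, motor_to_sensory):
        self.motor_to_sensory = motor_to_sensory  # assumed linear mapping (illustrative)

    def forward(self, efference_copy):
        """Forward mapping: motor plan -> predicted sensory event."""
        return self.motor_to_sensory @ efference_copy

    def inverse(self, desired_sensory):
        """Inverse mapping: desired sensory goal -> motor command (pseudo-inverse)."""
        return np.linalg.pinv(self.motor_to_sensory) @ desired_sensory

model = InternalModel(np.array([[1.0, 0.2], [0.1, 0.9]]))
plan = np.array([0.5, 1.0])                   # efference copy of a motor plan
predicted = model.forward(plan)               # expected sensory consequence
actual = predicted + np.array([0.05, -0.02])  # actual (perturbed) sensory feedback
prediction_error = actual - predicted         # mismatch available for online correction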

References

    1. Broca P. Remarques sur le siège de la faculté du langage articulé: suivies d'une observation d'aphémie (perte de la parole). Bull Soc Anat Paris. 1861;6:330–357.
    2. Wernicke C. Der aphasische Symptomencomplex: Eine psychologische Studie auf anatomischer Basis. Cohn & Weigert; Breslau, Germany: 1874.
    3. Wise RJ. Language systems in normal and aphasic human subjects: functional imaging studies and inferences from animal studies. Br Med Bull. 2003;65:95–119. - PubMed
    4. Rauschecker JP. Cortical processing of complex sounds. Curr Opin Neurobiol. 1998;8:516–521. - PubMed
    5. Rauschecker JP, Tian B. Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proc Natl Acad Sci USA. 2000;97:11800–11806. - PMC - PubMed
