Review

Neural specializations for speech and pitch: moving beyond the dichotomies

Robert J Zatorre et al. Philos Trans R Soc Lond B Biol Sci. 2008 Mar 12;363(1493):1087–104. doi: 10.1098/rstb.2007.2161.

Abstract

The idea that speech processing relies on unique, encapsulated, domain-specific mechanisms has been around for some time. Another well-known idea, often espoused as being in opposition to the first proposal, is that processing of speech sounds entails general-purpose neural mechanisms sensitive to the acoustic features that are present in speech. Here, we suggest that these dichotomous views need not be mutually exclusive. Specifically, there is now extensive evidence that spectral and temporal acoustical properties predict the relative specialization of right and left auditory cortices, and that this is a parsimonious way to account not only for the processing of speech sounds, but also for non-speech sounds such as musical tones. We also point out that there is equally compelling evidence that neural responses elicited by speech sounds can differ depending on more abstract, linguistically relevant properties of a stimulus (such as whether it forms part of one's language or not). Tonal languages provide a particularly valuable window onto the interplay between these processes. The key to reconciling these phenomena probably lies in understanding the interactions between afferent pathways that carry stimulus information and top-down processing mechanisms that modulate these processes. Although we are still far from the point of having a complete picture, we argue that moving forward will require us to abandon the dichotomy argument in favour of a more integrated approach.


Figures

Figure 1
Hemispheric differences in auditory cortex elicited by noise stimuli. (a,b) Illustration of how the noise stimuli were constructed; each matrix illustrates stimuli with different bandwidths (on the ordinate) and different temporal modulation rates (on the abscissa). (c) fMRI results indicating bilateral recruitment of distinct cortical areas with increasing rates of temporal or spectral modulation. (d) Effect sizes in selected areas of right (r) and left (l) auditory cortices. Note the significant interaction: the left anterolateral region responds more to temporal than to spectral modulation, whereas the right anterolateral region shows the opposite pattern. HG, Heschl's gyrus (Schönwiesner et al. 2005).
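To make concrete what it means to vary spectral and temporal modulation independently, the sketch below generates band-limited noise bursts whose bandwidth and amplitude-modulation rate are set separately, yielding a stimulus "matrix" like that illustrated in panels (a,b). This is a minimal illustration only; the centre frequency, bandwidths, modulation rates and sampling rate are assumptions and do not reproduce the stimuli of Schönwiesner et al. (2005).

    import numpy as np

    def make_noise_stimulus(bandwidth_hz, mod_rate_hz, fs=44100, dur=1.0,
                            centre_hz=1000.0, seed=0):
        """Band-limited noise whose spectral extent (bandwidth) and temporal
        envelope (sinusoidal amplitude-modulation rate) vary independently.
        Illustrative parameter values only."""
        rng = np.random.default_rng(seed)
        n = int(fs * dur)
        t = np.arange(n) / fs

        # Spectral dimension: band-pass white noise via frequency-domain masking.
        spectrum = rng.standard_normal(n) + 1j * rng.standard_normal(n)
        freqs = np.fft.fftfreq(n, 1 / fs)
        lo, hi = centre_hz - bandwidth_hz / 2, centre_hz + bandwidth_hz / 2
        spectrum[(np.abs(freqs) < lo) | (np.abs(freqs) > hi)] = 0
        carrier = np.real(np.fft.ifft(spectrum))

        # Temporal dimension: sinusoidal amplitude modulation at the given rate.
        envelope = 0.5 * (1 + np.sin(2 * np.pi * mod_rate_hz * t))
        stim = carrier * envelope
        return stim / np.max(np.abs(stim))  # normalise peak amplitude

    # A 3x3 stimulus matrix: bandwidth on one axis, modulation rate on the other
    # (values are assumptions chosen for illustration).
    stimuli = {(bw, mr): make_noise_stimulus(bw, mr)
               for bw in (100, 400, 1600)   # bandwidth in Hz
               for mr in (2, 8, 32)}        # temporal modulation rate in Hz

Crossing the two parameters in this way is what allows the fMRI contrast in panels (c,d) to attribute activation changes to spectral versus temporal modulation separately.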
Figure 2
(a) Illustrations of consonant–vowel (CV) speech and non-speech sounds containing similar acoustical properties. (b) fMRI images illustrating the overlap in left auditory cortex between responses to syllables and responses to non-speech gap stimuli. Similar results are obtained at two distinct time points (TPA1 and TPA2; Zaehle et al. 2004).
Figure 3
Magnetic-evoked potentials to a non-speech sound presented in different contexts. A left-hemisphere advantage is observed only when the target sound is embedded in (a,b) a real word (verb or noun, respectively), but not when it is presented in either (c) a non-speech context or (d) a pseudoword context, where it is perceived as speech but not as a recognizable word (Shtyrov et al. 2005).
Figure 4
Cortical activation maps comparing discrimination of pitch and duration patterns in a speech relative to a non-speech condition for groups of Thai and Chinese listeners. (a) Spectral contrast between Thai tones (speech) and pitch (non-speech) and (b) temporal contrast between Thai vowel length (speech) and duration (non-speech). Left-sided activation foci in frontal and temporo-occipital regions occur in the Thai group only. T, tone; P, pitch; VL, vowel length; D, duration (Gandour et al. 2002a,b).
Figure 5
Cortical areas activated in response to discrimination of Chinese and Thai tones. A common focus of activation is indicated by the overlap (yellow) between Chinese and Thai groups in the functional activation maps. Green cross-hair lines mark the stereotactic centre coordinates for the overlapping region in the left planum temporale ((a) coronal section, y=−25; (b) sagittal section, x=−44; (c) axial section, z=+7). A double dissociation ((d) bar charts) between tonal processing and language experience reveals that for the Thai group, Thai tones elicit stronger activity relative to Chinese tones, whereas for the Chinese group, stronger activity is elicited by Chinese tones relative to Thai tones. CC, Chinese tones superimposed on Chinese syllables, i.e. Chinese words; CT, Thai tones superimposed on Chinese syllables, i.e. tonal chimeras; L/LH, left hemisphere; R/RH, right hemisphere; ROI, region of interest (Xu et al. 2006).
Figure 6
Laterality effects for ROIs in (a) the Chinese group only and in (b) both Chinese and English groups, as rendered on a three-dimensional LH template for common reference. (a) In the Chinese group, the ventral aspects of the inferior parietal lobule and the anterior and posterior STG are lateralized to the LH across tone and intonation tasks; the anterior MFG and intraparietal sulcus are LH-lateralized for only a subset of tasks. (b) In both groups, the middle portions of the MFG and STS are lateralized to the RH across tasks. This right-sided fronto-temporal network subserves pitch processing regardless of its linguistic function. Other ROIs do not show laterality effects. MFG, middle frontal gyrus; STG, superior temporal gyrus; STS, superior temporal sulcus (Gandour et al. 2004).
Figure 7
Grand-average f0 contours of Mandarin tone 2 derived from the FFR waveforms of all subjects across both ears in the Chinese and English groups. The f0 contour of the original speech stimulus is displayed in black. The enlarged inset shows that the f0 contour derived from the FFR waveforms of the Chinese group more closely approximates that of the original stimulus (yi2 ‘aunt’) than does that of the English group (Krishnan et al. 2005).
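The contours in this figure are obtained by tracking the fundamental frequency over time in the averaged FFR waveform. As a general illustration of that kind of analysis, the sketch below estimates a running f0 contour from a periodic waveform by short-time autocorrelation; the frame length, hop size, f0 search range and sampling rate are assumptions for illustration, not the analysis parameters of Krishnan et al. (2005).

    import numpy as np

    def f0_contour(x, fs, frame_ms=40.0, hop_ms=10.0, fmin=80.0, fmax=300.0):
        """Estimate a running f0 contour from a periodic waveform (e.g. an
        averaged FFR) by short-time autocorrelation. Illustrative parameters."""
        frame = int(fs * frame_ms / 1000)
        hop = int(fs * hop_ms / 1000)
        lag_min, lag_max = int(fs / fmax), int(fs / fmin)
        times, f0s = [], []
        for start in range(0, len(x) - frame, hop):
            seg = x[start:start + frame]
            seg = seg - seg.mean()
            ac = np.correlate(seg, seg, mode='full')[frame - 1:]  # non-negative lags
            lag = lag_min + np.argmax(ac[lag_min:lag_max])        # best period in range
            times.append((start + frame / 2) / fs)                # frame-centre time (s)
            f0s.append(fs / lag)                                  # period -> frequency
        return np.array(times), np.array(f0s)

    # Usage (hypothetical data): compare the contour from a grand-average FFR
    # with that of the original /yi2/ stimulus, both assumed loaded as 1-D arrays.
    # t_ffr, f0_ffr = f0_contour(ffr_waveform, fs=20000)
    # t_stim, f0_stim = f0_contour(stimulus_waveform, fs=20000)

Comparing the two contours point by point (e.g. by correlation or root-mean-square error) is one simple way to quantify how faithfully the brainstem response tracks the pitch contour of the stimulus, which is the kind of group difference the figure depicts.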

