Nat Hum Behav. 2019 Apr;3(4):393-405.
doi: 10.1038/s41562-019-0548-z. Epub 2019 Mar 4.

Spectrotemporal modulation provides a unifying framework for auditory cortical asymmetries

Adeen Flinker et al. Nat Hum Behav. 2019 Apr.

Abstract

The principles underlying functional asymmetries in cortex remain debated. For example, it is accepted that speech is processed bilaterally in auditory cortex, but a left hemisphere dominance emerges when the input is interpreted linguistically. The mechanisms, however, are contested, such as what sound features or processing principles underlie laterality. Recent findings across species (humans, canines and bats) provide converging evidence that spectrotemporal sound features drive asymmetrical responses. Typically, accounts invoke models wherein the hemispheres differ in time-frequency resolution or integration window size. We develop a framework that builds on and unifies prevailing models, using spectrotemporal modulation space. Using signal processing techniques motivated by neural responses, we test this approach, employing behavioural and neurophysiological measures. We show how psychophysical judgements align with spectrotemporal modulations and then characterize the neural sensitivities to temporal and spectral modulations. We demonstrate differential contributions from both hemispheres, with a left lateralization for temporal modulations and a weaker right lateralization for spectral modulations. We argue that representations in the modulation domain provide a more mechanistic basis to account for lateralization in auditory cortex.

Conflict of interest statement

Competing Interests

The authors declare no competing interests.

Figures

Figure 1
Time, time-frequency, and modulation domain representations for sound waveforms. (a) The sound waveform of a spoken sentence (left panel, top) is shown along with its corresponding spectrogram representation (left panel, bottom). The spectrogram can be represented as a decomposition in the modulation domain (middle panel) of horizontal (temporal, typically as cycles per second) and vertical (spectral, typically as cycles per octave) modulations. The degree (power intensity) of temporal and spectral modulations in the spectrogram is depicted in the right panel (showing the average modulation spectra of all our speech material, N=84). Superimposed gray squares correspond to the approximate temporal and spectral ranges shown in the middle panel. (b) To provide better intuitive insight into these representations, two artificially created audio signals are shown with a temporal modulation peak (left panel, top waveform) and a spectral modulation peak (left panel, bottom waveform). For both audio signals, representations are shown in the time domain (left), the time-frequency domain (middle), and the modulation domain (right).
Figure 2
A schematic depiction of the modulation asymmetry hypothesis whereby the left auditory system integrates a wide range of temporal modulations but a limited range of spectral modulations and the right auditory system integrates a wide range of spectral modulations but a limited range of temporal modulations.
Figure 3
Overview of the filtering technique used to produce modulation domain filtered stimuli. (a) An audio waveform is filtered in the frequency domain using a cochlear filter bank, and the subsequent spectrogram is two-dimensionally Fourier transformed to produce a modulation domain representation. In the modulation domain, signals are low-pass filtered using a temporal (b) or spectral (c) cutoff and inverse Fourier transformed to produce the desired modulation-limited spectrogram. Finally, an iterative convex projection technique is employed to produce an audio signal that maximally matches the desired modulation-limited spectrogram.
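The modulation-domain low-pass step described in this caption can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: it assumes a spectrogram is already available as a (channels x frames) array with log-spaced frequency channels, and it leaves out the cochlear filter bank and the iterative convex projection back to audio. The function name `modulation_lowpass` and all of its parameters are hypothetical.

```python
import numpy as np

def modulation_lowpass(spec, frame_rate_hz, chans_per_octave,
                       temporal_cutoff_hz=None, spectral_cutoff_cyc_oct=None):
    """Low-pass a (channels x frames) spectrogram in the 2D modulation domain.

    Hypothetical helper: assumes log-spaced channels (chans_per_octave)
    and uniformly sampled frames (frame_rate_hz).
    """
    n_chan, n_frames = spec.shape
    # 2D Fourier transform of the spectrogram gives the modulation domain.
    mod = np.fft.fft2(spec)
    # Modulation axes: cycles/octave along channels, Hz along time.
    spectral_mod = np.fft.fftfreq(n_chan, d=1.0 / chans_per_octave)
    temporal_mod = np.fft.fftfreq(n_frames, d=1.0 / frame_rate_hz)
    # Keep only modulations at or below the requested cutoffs.
    mask = np.ones((n_chan, n_frames), dtype=bool)
    if temporal_cutoff_hz is not None:
        mask &= (np.abs(temporal_mod) <= temporal_cutoff_hz)[np.newaxis, :]
    if spectral_cutoff_cyc_oct is not None:
        mask &= (np.abs(spectral_mod) <= spectral_cutoff_cyc_oct)[:, np.newaxis]
    # Inverse transform yields the modulation-limited spectrogram.
    return np.real(np.fft.ifft2(mod * mask))

# Toy spectrogram: every channel carries a 2 Hz and a 20 Hz temporal modulation.
t = np.arange(200) / 100.0  # 2 s at an assumed 100 frames per second
row = 1.0 + np.cos(2 * np.pi * 2 * t) + np.cos(2 * np.pi * 20 * t)
spec = np.tile(row, (16, 1))
filtered = modulation_lowpass(spec, 100.0, 4.0, temporal_cutoff_hz=8.0)
```

In this toy case the 8 Hz temporal cutoff removes the 20 Hz component and leaves the 2 Hz component and the mean untouched. A real pipeline would still need a resynthesis step to turn the filtered spectrogram back into audio.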
Figure 4
Psychophysical performance as a function of temporal and spectral modulations in two separate experiments (diotic and dichotic). (a) Intelligibility (proportion of words transcribed correctly; N=20) at different temporal modulation cutoffs (left) and voice pitch identification (percent of correct responses; same participants, N=20) at different spectral modulation cutoffs (right). Raw data are shown for each modulation value, with the mean and standard error of the mean across participants depicted by the error bars. Additionally, the within-subject curve fits are shown, with a solid curve depicting the mean and the shaded area depicting the standard error of the mean across participants. Stimuli are repeated in a subsequent block (block 2, in red). (b) The same tasks are used in a separate experiment employing dichotic stimulus presentation (N=60), with a different level of modulation information in each ear. Each tick on the x-axis represents two different modulation values, one presented to each ear, which are denoted in parentheses below the tick. The color code denotes the ear in which the higher of the two values was presented (blue, right ear was presented with a higher value; cyan, left ear was presented with a higher value). Higher intelligibility is seen for sentences with more temporal modulations in the right ear compared to the left (dark blue curve). This right ear advantage is evident for temporal modulations but not spectral modulations.
Figure 5
MEG results showing significant correlations between neural power and degree of stimulus modulation. For each task, the average neural power for each modulation cutoff is shown (top), locked to the onset of a sentence (leftmost x-axis) as well as the offset (rightmost x-axis), with the mean across sensors and participants and shaded error bars depicting the standard error of the mean across participants (N=19). The correlations are shown in the middle panel as a function of time (mean across sensors and participants and shaded SEM across participants, N=19) and spatial sensor locations (mean across time and participants, N=19) in the topography plots above. Power estimates are projected to source space using a minimum norm estimate, and significant correlations in source space are shown in the bottom panels (mean correlations across participants, N=19).
Figure 6
Intracranial ECoG recordings in a patient with bilateral stereotactic depth electrodes, sampling superior temporal cortices. Reconstructions of electrode locations (MNI coordinates = [−60.17, −7.76, 2.82], [52.81, −7.95, 2.64]) are shown (top) for an axial slice at Z=3. For two significant electrodes, mean low-frequency (0.1–8 Hz) power traces (shaded SEM across trials, N=16) are shown for each modulation filter (middle), along with average power collapsed across time (bottom; mean and SEM across trials, N=16).

Comment in

  • The asymmetric auditory cortex.
    Hamilton LS. Nat Hum Behav. 2019 Apr;3(4):327-328. doi: 10.1038/s41562-019-0582-x. PMID: 30971803. No abstract available.

