J Neurosci. 2002 May 15;22(10):4114-31.
doi: 10.1523/JNEUROSCI.22-10-04114.2002.

Nonlinear spectrotemporal sound analysis by neurons in the auditory midbrain

Monty A Escabi et al. J Neurosci. 2002.

Abstract

The auditory system of humans and animals must process information from sounds that dynamically vary along multiple stimulus dimensions, including time, frequency, and intensity. Therefore, to understand neuronal mechanisms underlying acoustic processing in the central auditory pathway, it is essential to characterize how spectral and temporal acoustic dimensions are jointly processed by the brain. We use acoustic signals with a structurally rich time-varying spectrum to study linear and nonlinear spectrotemporal interactions in the central nucleus of the inferior colliculus (ICC). Our stimuli, the dynamic moving ripple (DMR) and ripple noise (RN), allow us to systematically characterize response attributes with the spectrotemporal receptive field (STRF) methods to a rich and dynamic stimulus ensemble. Theoretically, we expect that STRFs derived with DMR and RN would be identical for a linear integrating neuron, and we find that approximately 60% of ICC neurons meet this basic requirement. We find that the remaining neurons are distinctly nonlinear; these could either respond selectively to DMR or produce no STRFs despite selective activation to spectrotemporal acoustic attributes. Our findings delineate rules for spectrotemporal integration in the ICC that cannot be accounted for by conventional linear-energy integration models.

Figures

Fig. 1.
Synthetic sound sequence used for reverse correlation analysis (C, D) and some corresponding natural sound counterparts (A, kitten vocalizations; B, babbling brook). The DMR (C) is designed to mimic spectral profiles created by formants (spectral energy peaks) and temporal modulations in speech production and animal vocalizations. The ripple density parameter, Ω(t), corresponds to the number of energy peaks (cycles per octave) along the spectral axis at time t. The temporal modulation rate, Fm(t), describes the repetition rate of the envelope in hertz. The second stimulus, the RN (D), has noise-like properties that uniformly cover the ripple dimensions. The DMR and RN are shown for a maximum temporal modulation rate of 70 Hz, although a value of 350 Hz was used for the experiments.
Fig. 2.
Stimulus dynamics and spectrotemporal correlation statistics of the DMR and RN. The DMR parameter trajectories Ω(t) (A; ripple density, 0–4 cycles per octave) and Fm(t) (B; modulation rate, −350–350 Hz) are shown for a short 15 sec segment. The spectrotemporal parameters efficiently cover the ripple space (C; shown for the 15 sec segments of A, B). The instantaneous correlation functions of the DMR (D) and RN (E) are shown for three distinct time instants, t1–t3 [D; left to right, Ω(t1) = 1 cycle per octave; Fm(t1) = 0 Hz; Ω(t2) = 2 cycles per octave; Fm(t2) = 150 Hz; Ω(t3) = 0.15 cycles per octave; Fm(t3) = −60 Hz]. The RN instantaneous correlation function consists of a narrow central peak and a noisy surround (E). The global autocorrelation is identical for both sounds, consisting of an impulse-like central peak of width 3 msec and one-fourth octave (D, E, far right).
Fig. 3.
Spike-triggered average and the STRF. At each instant of an action potential, the pre-event sound segment (up to 100 msec before spiking) is extracted and averaged for the entire stimulus ensemble. Red regions indicate stimulus patterns that were likely to be present whenever a neural response occurred at delay of 0. Blue indicates stimulus patterns that tended to be off at a moment before spike initiation. Functionally, these are interpreted as excitation (red) and inhibition (blue).
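The averaging step described in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name, the array layout (frequency channels by time bins), and the use of integer spike bins are all assumptions made for the example.

```python
import numpy as np

def spike_triggered_average(spectrogram, spike_bins, n_lags):
    """Average the stimulus segments that precede each spike.

    spectrogram: array (n_freq, n_time), stimulus envelope per frequency channel
                 (hypothetical layout, chosen for this sketch)
    spike_bins:  integer time-bin indices at which spikes occurred
    n_lags:      number of time bins of pre-spike history to keep
    Returns an (n_freq, n_lags) array; column 0 is the bin at the spike,
    column k is the bin k steps before the spike.
    """
    spectrogram = np.asarray(spectrogram, dtype=float)
    n_freq, n_time = spectrogram.shape
    sta = np.zeros((n_freq, n_lags))
    n_used = 0
    for t in spike_bins:
        if t - n_lags + 1 < 0 or t >= n_time:
            continue  # skip spikes without a complete pre-event window
        segment = spectrogram[:, t - n_lags + 1 : t + 1]
        sta += segment[:, ::-1]  # reverse time so that column index equals delay
        n_used += 1
    if n_used == 0:
        raise ValueError("no spikes with a complete pre-event window")
    return sta / n_used
```

With a zero-mean stimulus, positive entries of the result correspond to the red (excitatory) regions of the caption and negative entries to the blue (inhibitory) regions. At 1 msec bins, `n_lags=100` would cover the 100 msec window mentioned above.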
Fig. 4.
Spectrotemporal receptive fields of neurons that responded to DMR and RN. Neurons were tested with pure tones (A, D, left column), DMR (B, E, G, I, middle column), and the RN (C, F, H, J, right column) stimuli (individual neurons are shown by row). Frequency-tuning curves depict the frequency versus intensity response area of a neuron (A, D). The red horizontal line designates the mean stimulus level (per one-third octave) used for RN and DMR. STRFs have similar shapes (similarity index: B, C, 0.94; E, F, 0.76; G, H, 0.77; I, J, 0.7) and strength (magnitude disparity index: B, C, −13%; E, F, 178%; G, H, 74%; I, J, 4%; rate disparity index: B, C, −6%; E, F, 35%; G, H, 24%; I, J, −53%). To facilitate comparisons, STRFs are shown on identical color scales for RN and DMR. STRFs for each neuron are drawn on individually chosen spectral and temporal scales. Significant patterns of the STRF are denoted by red contours (p < 0.002 contour).
Fig. 5.
Frequency domain response analysis. The auditory STRF (A; shown for RN) is used to compute the RTF (B; shown for RN) by applying a two-dimensional Fourier transform. The RTF depicts time-locked energy in the neural response as a function of temporal modulation rate, Fm, and ripple density, Ω. Red indicates parameter combinations that evoked a strong time-locked response, and blue indicates a weak response. The CRH (D) characterizes nonlinear neuronal responses that do not show up in the STRF. For each neural event, the spectral and temporal DMR parameters, Ω(tk) and Fm(tk), are determined at the time instance of the neural spike, tk. The values of Ω and Fm are then used to increment the corresponding bin in the histogram by +1 (D).
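Both operations in this caption are simple to express in code. The sketch below is illustrative only: the function names, argument layout, and bin edges are assumptions, and taking the magnitude of the transform (rather than, say, its squared magnitude) is a choice made for the example, not something the caption specifies.

```python
import numpy as np

def ripple_transfer_function(strf, dt, dx):
    """Magnitude of the 2-D Fourier transform of an STRF.

    strf: array (n_freq, n_lags); rows are octave-spaced frequency channels,
          columns are time delays (hypothetical layout for this sketch)
    dt:   time step in seconds
    dx:   frequency step in octaves
    Returns (rtf, fm_axis, rd_axis) with zero frequency centered; fm_axis is
    in Hz and rd_axis in cycles per octave.
    """
    strf = np.asarray(strf, dtype=float)
    rtf = np.abs(np.fft.fftshift(np.fft.fft2(strf)))
    rd_axis = np.fft.fftshift(np.fft.fftfreq(strf.shape[0], d=dx))
    fm_axis = np.fft.fftshift(np.fft.fftfreq(strf.shape[1], d=dt))
    return rtf, fm_axis, rd_axis

def conditioned_response_histogram(omega, fm, spike_bins, omega_edges, fm_edges):
    """Count spikes per (ripple density, modulation rate) bin.

    omega, fm:  DMR parameter trajectories sampled once per time bin
    spike_bins: integer time-bin indices of the spikes
    Each spike adds +1 to the bin holding the parameters at the spike time.
    """
    omega_at_spikes = np.asarray(omega)[spike_bins]
    fm_at_spikes = np.asarray(fm)[spike_bins]
    counts, _, _ = np.histogram2d(
        omega_at_spikes, fm_at_spikes, bins=[omega_edges, fm_edges]
    )
    return counts
```

The histogram uses only the stimulus parameters at each spike time, not the stimulus waveform, which is why it can reveal tuning even when averaging the waveform yields no significant STRF.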
Fig. 6.
Neuronal preferences determined with the RTF (left column) and CRH (right column) shown for the neurons of Figures 4B,I and 7J,G (A–D, respectively). The RTF and CRH depict the spectrotemporal frequency combinations (modulation frequency and ripple density) that preferentially activate a neuron. These can show either a low-pass or bandpass tuning profile along the temporal modulation or ripple density axis. Generally, neuronal tuning is similar for the RTF and CRH.
Fig. 7.
Spectrotemporal receptive fields of neurons that responded specifically to the DMR sound (B, D, G, J, middle column) but responded weakly or had no response to the RN (C, E, H, K, right column). Frequency-tuning curves derived with pure tones are shown for reference (A, F, I, left column). Red lines designate the mean stimulus level (per one-third octave) used for DMR and RN. Significant STRF patterns are denoted by red contours. All neurons are shown at distinct spectral and temporal scales.
Fig. 8.
Response statistics comparing the DMR versus the RN. The MDI and RDI quantify differences in mean firing rate and driven activity for the DMR and RN (A). Type II neurons have RDI and MDI values that exceed the 500% contour. STRF shape differences are quantified with the SI, which usually takes values from 0 (not similar) to 1 (very similar). The population similarity index distribution (B) is bimodally distributed, with the majority of neurons falling at ∼0.7.
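The caption does not give a formula for the SI. One common way to compare the shapes of two receptive fields is a normalized inner product (a correlation coefficient between the two STRFs treated as vectors), which is insensitive to overall response strength. The sketch below uses that definition as an assumption; it may differ from the exact measure used in the paper, and the function name is hypothetical.

```python
import numpy as np

def similarity_index(strf_a, strf_b):
    """Normalized inner product of two STRFs sampled on the same grid.

    ASSUMED definition for illustration: 1 for identical shapes, 0 for
    orthogonal shapes, negative when excitation and inhibition are swapped.
    """
    a = np.ravel(np.asarray(strf_a, dtype=float))
    b = np.ravel(np.asarray(strf_b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("STRFs must be sampled on the same grid")
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        raise ValueError("an all-zero STRF has no defined shape")
    return float(np.dot(a, b) / denom)
```

Because the measure is normalized, scaling one STRF leaves the index unchanged, so a separate measure of magnitude is still needed to compare response strength.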
Fig. 9.
Neurons for which the STRF procedure fails. STRFs derived with DMR (B, E, G, middle column) show no significant spectrotemporal patterns (p < 0.002) and, therefore, provide little information about stimulus features being encoded. Pure tone tuning curves are shown for reference (A, D, left column). Spectrotemporal feature selectivity is established with the conditioned response histogram (C, F, H, right column), which always shows a tuned response not observed directly from the STRF or the RTF.
Fig. 10.
Phase-locking statistics for the DMR stimulus. A, Left, A 5 sec segment of the DMR stimulus was presented to each neuron. Rastergrams (B–D, left) show individual response traces for 40 consecutive presentations for type I (B), II (C), and III (D) neurons. The occurrence of each action potential is shown as a single dot (1 msec resolution). A–C, Far right, Cutout (red) detailing the stimulus spectrotemporal envelope (A, far right) and the response rastergrams of each neuron (B–D, far right). E, The PLI measures the ability of a neuron to phase lock to the spectrotemporal envelope of a sound. A PLI of 0 indicates minimal linear phase locking, and a PLI of 1 indicates perfect phase locking. PLI distribution is skewed toward low values (mean PLI, 0.24) but extends over a broad range of 0–0.75.
Fig. 11.
pRTFs and best ripple parameter statistics. A, Scatter plot of the bRD and bTM for the observed neural responses (triangles, type I; circles, type II; squares, type III). The pRTF of type I (B), II (C), and III (D) neurons depicts the filtering profiles of each neural response type. Black contours designate the 95th percentile interval (boundaries that account for 95% of the area under the pRTF).
