PLoS Comput Biol. 2013;9(3):e1002982.
doi: 10.1371/journal.pcbi.1002982. Epub 2013 Mar 28.

Sustained firing of model central auditory neurons yields a discriminative spectro-temporal representation for natural sounds

Michael A Carlin et al. PLoS Comput Biol. 2013.

Abstract

The processing characteristics of neurons in the central auditory system are directly shaped by and reflect the statistics of natural acoustic environments, but the principles that govern the relationship between natural sound ensembles and observed responses in neurophysiological studies remain unclear. In particular, accumulating evidence suggests the presence of a code based on sustained neural firing rates, where central auditory neurons exhibit strong, persistent responses to their preferred stimuli. Such a strategy can indicate the presence of ongoing sounds, is involved in parsing complex auditory scenes, and may play a role in matching neural dynamics to varying time scales in acoustic signals. In this paper, we describe a computational framework for exploring the influence of a code based on sustained firing rates on the shape of the spectro-temporal receptive field (STRF), a linear kernel that maps a spectro-temporal acoustic stimulus to the instantaneous firing rate of a central auditory neuron. We demonstrate the emergence of richly structured STRFs that capture the structure of natural sounds over a wide range of timescales, and show how the emergent ensembles resemble those commonly reported in physiological studies. Furthermore, we compare ensembles that optimize a sustained firing code with one that optimizes a sparse code, another widely considered coding strategy, and suggest how the resulting population responses are not mutually exclusive. Finally, we demonstrate how the emergent ensembles contour the high-energy spectro-temporal modulations of natural sounds, forming a discriminative representation that captures the full range of modulation statistics that characterize natural sound ensembles. These findings have direct implications for our understanding of how sensory systems encode the informative components of natural stimuli and potentially facilitate multi-sensory integration.
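The abstract describes the STRF as a linear kernel that maps a spectro-temporal stimulus to an instantaneous firing rate. A minimal sketch of that mapping follows, in plain Python to stay dependency-free. The function names (`strf_response`, `ensemble_response`), the list-of-lists layout, and the toy values are assumptions made for illustration and are not taken from the paper; the sketch also omits any output nonlinearity or normalization the authors may apply.

```python
def strf_response(spectrogram, strf):
    """Linear STRF response: r(t) = sum_f sum_k strf[f][k] * spectrogram[f][t - k].

    spectrogram: list of frequency channels, each a list of time samples.
    strf: list of frequency channels, each a list of taps (tap k = lag of k frames).
    Returns one rate per time frame for which the full kernel fits.
    """
    n_taps = len(strf[0])
    n_frames = len(spectrogram[0])
    rates = []
    for t in range(n_taps - 1, n_frames):
        total = 0.0
        for channel, kernel in zip(spectrogram, strf):
            for k, weight in enumerate(kernel):
                total += weight * channel[t - k]
        rates.append(total)
    return rates


def ensemble_response(spectrogram, strfs):
    """Apply every STRF in an ensemble to the same stimulus."""
    return [strf_response(spectrogram, h) for h in strfs]


# Toy stimulus: 2 frequency channels, 5 time frames (hypothetical values).
spectrogram = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0, 0.0],
]
# Toy kernel: excitatory in channel 0, inhibitory at zero lag in channel 1.
strf = [
    [1.0, 0.5],
    [-1.0, 0.0],
]
print(strf_response(spectrogram, strf))  # [1.0, -1.5, 0.0, 0.0]
```

Because the kernel is purely linear, the output can go negative; treating it as a firing rate would normally require a further step, such as rectification, which is left out here.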

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Schematic of the proposed framework.
Panel (A) shows an example of an auditory spectrogram for the speech utterance “serve on frankfurter buns…” whereas panel (B) illustrates how spectro-temporal patches are mapped to an ensemble of instantaneous neural firing rates.
Figure 2. Examples of emergent STRFs.
Shown are STRFs learned by optimizing (A) the sustained objective function [formula] for [formula] and (B) the sparsity objective function [formula]. The examples shown here were drawn at random from ensembles of 400 neurons. The sustained STRFs are shown in order of decreasing contribution to the overall objective function whereas the sparse STRFs are shown randomly ordered. Each spectro-temporal patch spans 0–250 ms in time and 62.5–4000 Hz in frequency. For these examples the dynamic range of the STRFs was compressed using a [formula] nonlinearity.
Figure 3. Spectral clustering results.
Shown are nine clusters obtained by pooling STRFs from the sparse as well as sustained ensembles for [formula] = 10, 25, 50, 125, 250, 500, 1000, and 2500 ms. Shown in the center is a stacked bar chart where segment color corresponds to class label and segment width is proportional to the number of STRFs assigned to a particular class in a given ensemble. The surrounding panels show examples of STRFs drawn from six illustrative classes, namely, noisy, localized, spectral, complex, temporal, and directional.
Figure 4. Analysis of the temporal activations of emergent ensembles.
Panel (A) shows the median activation time of individual neurons (solid lines, sorted in decreasing order) for [formula] = 10 and 125 ms, respectively, for STRFs that optimize the sustained objective function. The shaded region illustrates the corresponding interquartile range. Panel (B) shows the distributions (as boxplots) of median activation times of the top 10% “most persistent” neurons for sparse and sustained ensembles for increasing [formula].
Figure 5. Comparison of emergent STRFs learned according to the sustained objective function with examples estimated from ferret auditory cortex.
Figure 6. Cluster analysis of neural STRFs.
Illustration of the overlap between the eMTFs of neural STRF clusters and that of the response-constrained sustained objective model STRFs; class 9 comprised mostly noisy STRFs with an exceedingly broad eMTF and its contour is omitted here for clarity. The white contour corresponds to the model eMTF at the 65% level.
Figure 7. Ensemble analysis of STRFs learned under the sustained objective function for [formula].
In panels (A), (B), (C) and (E), the histograms show the distribution of model parameters whereas the thin green lines show the distribution of the physiological data. The black and green dashed vertical lines show population means for the model and neural data, respectively. In panels (D) and (F), the black and green lines correspond to the model and neural STRFs, respectively, with the dashed lines indicating 6-dB upper cutoff frequencies. Refer to the text for more details.
Figure 8. Average population response histograms for STRFs learned under the sustained and sparse objectives subject to response constraints.
Figure 9. Examples of STRFs learned under the sustained objective function ([formula]) subject to orthonormality constraints on the shapes of the filters.
The examples shown here were drawn at random from an ensemble of 400 neurons, and the STRFs are shown in order of decreasing contribution to the overall objective function. Each spectro-temporal patch spans 0–250 ms in time and 62.5–4000 Hz in frequency. For these examples the dynamic range of the STRFs was compressed using a [formula] nonlinearity.
Figure 10. Spectro-temporal modulations in the stimulus are fully captured by STRFs that promote sustained responses subject to response and shape constraints.
Here, the average MTF of the stimulus is overlaid with contours (at the 65% level) of the ensemble MTFs for both constraints for [formula]. For each ensemble we also show the constellations for best rate vs. best scale (marked by ‘[formula]’ and ‘[formula]’ for response and shape constraints, respectively). For the response constraints, we show the contour line and BR/BS constellations for STRFs that contribute to 99% of the objective function.
Figure 11. Extracting basic spectro-temporal parameters for an individual STRF.
Panel (A) shows a typical STRF, with solid contour lines indicating those regions that exceed [formula] one standard deviation. The dashed red line shows the projected 10-dB ellipse from which we estimated spectral bandwidth. As indicated, the STRF is rather elongated with no strong directional preference, and the pattern is highly separable. Panel (B) shows the MTF computed from the magnitude of the 2D Fourier Transform of the STRF in (A); from here we estimate [formula] and [formula]. Panel (C) shows the normalized temporal and spectral modulation profiles obtained from the MTF.
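The Figure 11 caption states that the MTF is the magnitude of the 2D Fourier transform of the STRF and that normalized temporal and spectral modulation profiles are obtained from it. The sketch below illustrates one way this could be computed, using a naive 2D DFT in plain Python (an FFT from a numerical library would be the usual choice for realistic sizes). How the profiles are derived is an assumption here: they are taken as the marginal sums of the MTF along each axis, scaled to a peak of 1. The function names and the toy STRF are hypothetical.

```python
import cmath


def mtf(strf):
    """Magnitude of the 2D DFT of an STRF given as rows = frequency, cols = time."""
    n_freq, n_time = len(strf), len(strf[0])
    out = [[0.0] * n_time for _ in range(n_freq)]
    for u in range(n_freq):          # spectral-modulation index
        for v in range(n_time):      # temporal-modulation index
            acc = 0j
            for f in range(n_freq):
                for t in range(n_time):
                    phase = -2j * cmath.pi * (u * f / n_freq + v * t / n_time)
                    acc += strf[f][t] * cmath.exp(phase)
            out[u][v] = abs(acc)
    return out


def normalize(values):
    """Scale a profile so that its peak equals 1 (left unchanged if all zero)."""
    peak = max(values)
    return [x / peak for x in values] if peak > 0 else list(values)


def modulation_profiles(m):
    """Assumed definition: marginals of the MTF, each normalized to a peak of 1."""
    temporal = [sum(row[v] for row in m) for v in range(len(m[0]))]
    spectral = [sum(row) for row in m]
    return normalize(temporal), normalize(spectral)


# Toy STRF: identical in both frequency channels, one temporal cycle per 4 frames.
toy_strf = [
    [1.0, 0.0, -1.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
]
toy_mtf = mtf(toy_strf)
temporal_profile, spectral_profile = modulation_profiles(toy_mtf)
```

For this toy kernel the energy concentrates at zero spectral modulation and at the temporal bins corresponding to one cycle per four frames (indices 1 and 3, which mirror each other), which is what a purely temporal pattern should produce.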
