J Neurosci Methods. 2021 Aug 1;360:109177. doi: 10.1016/j.jneumeth.2021.109177. Epub 2021 Apr 9.

Computational framework for investigating predictive processing in auditory perception

Benjamin Skerritt-Davis et al. J Neurosci Methods. 2021.

Abstract

Background: The brain tracks sound sources as they evolve in time, collecting contextual information to predict future sensory inputs. Previous work on predictive coding typically focuses on the perception of predictable stimuli, leaving open how the same neural processes operate in more complex, real-world environments containing randomness and uncertainty.

New method: To facilitate investigation of perception in less tightly controlled listening scenarios, we present a computational model as a tool for asking targeted questions about the underlying predictive processes that connect complex sensory inputs to listener behavior and neural responses. In the modeling framework, observed sound features (e.g., pitch) are tracked sequentially using Bayesian inference. Sufficient statistics are inferred from past observations at multiple time scales and used to make predictions about future observations while tracking the statistical structure of the sensory input.
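
As a concrete illustration of this kind of sequential inference, the short Python sketch below (not taken from the paper or its released code) tracks a single Gaussian-distributed feature such as pitch using Normal-Gamma sufficient statistics; the posterior predictive is then a Student-t distribution, and the surprisal of each new observation is its negative log predictive probability. The class name, prior settings, and toy pitch values are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' implementation): sequential
# Bayesian tracking of one sound feature assumed Gaussian with unknown mean
# and variance, summarized by Normal-Gamma sufficient statistics.
from dataclasses import dataclass
import math

@dataclass
class NormalGamma:
    mu: float = 0.0      # prior mean of the feature
    kappa: float = 1.0   # pseudo-count on the mean
    alpha: float = 1.0   # shape parameter for the precision
    beta: float = 1.0    # rate parameter for the precision

    def update(self, x: float) -> "NormalGamma":
        """Incorporate one observation into the sufficient statistics."""
        return NormalGamma(
            mu=(self.kappa * self.mu + x) / (self.kappa + 1.0),
            kappa=self.kappa + 1.0,
            alpha=self.alpha + 0.5,
            beta=self.beta
            + self.kappa * (x - self.mu) ** 2 / (2.0 * (self.kappa + 1.0)),
        )

    def log_predictive(self, x: float) -> float:
        """Log density of the Student-t posterior predictive at x."""
        df = 2.0 * self.alpha
        scale = math.sqrt(self.beta * (self.kappa + 1.0) / (self.alpha * self.kappa))
        z = (x - self.mu) / scale
        return (math.lgamma((df + 1.0) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi) - math.log(scale)
                - (df + 1.0) / 2.0 * math.log1p(z * z / df))

# Track a toy pitch sequence (Hz); surprisal is the negative log predictive
# probability of each new observation given everything observed so far.
stats = NormalGamma()
for x in [220.0, 224.0, 218.0, 330.0]:
    surprisal = -stats.log_predictive(x)   # predict before observing
    stats = stats.update(x)                # then absorb the observation
    print(f"x={x:6.1f}  surprisal={surprisal:.2f} nats")
```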

Results: Facets of the model are discussed in terms of their application to perceptual research, and examples taken from real-world audio demonstrate the model's flexibility to capture a variety of statistical structures along various perceptual dimensions.

Comparison with existing methods: Previous models are often targeted toward interpreting a particular experimental paradigm (e.g., oddball paradigm), perceptual dimension (e.g., pitch processing), or task (e.g., speech segregation), thus limiting their ability to generalize to other domains. The presented model is designed as a flexible and practical tool for broad application.

Conclusion: The model is presented as a general framework for generating new hypotheses and guiding investigation into the neural processes underlying predictive coding of complex scenes.

Keywords: Bayesian inference; Predictive coding; Statistical inference; Neural decoding; Uncertainty.

Figures

Figure 1: Model description. a) The model uses multiple context hypotheses to account for unknown changes in the observed sequence. Context-specific predictions P_t based on sufficient statistics Θ_t are combined, weighted by the corresponding beliefs B_t, to yield the predictive distribution Ψ_t for the next input x_{t+1}. b) Upon observing x_{t+1}, the predictions and new input are used to update the statistics and beliefs, which are in turn used to predict the next input, and so on. At each time, the model produces three principal outputs: the surprisal of the newly observed input given its prediction, the predictive distribution for the next input, and the beliefs (the posterior distribution over contexts). c) Model outputs for the example sequence in a). The top panel shows the predictive distribution at each time (in blue) with the input sequence overlaid; the middle panel shows the context beliefs, with each row corresponding to a particular context hypothesis c_i; the bottom panel shows the surprisal for each input observation. Note that the predictive distribution and context beliefs reflect the underlying change in statistics inferred by the model.
Figure 2: Model outputs for example inputs from real-world audio clips. Each panel displays the model's predictive distribution (top), context beliefs (middle), and surprisal (bottom) over time, with the input sequence overlaid on the predictive distribution (top, in black). The input dimension (feature), distributional choice in the model, and audio event annotation are indicated above each panel. Audio clips can be found in the Supplemental Information.
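
The multiple-context loop summarized in the Figure 1 caption can be sketched in the same spirit (again a hypothetical illustration rather than the authors' implementation). It reuses the NormalGamma class from the sketch above and assumes a fixed context-change (hazard) rate H, a free parameter not specified here: at each step the belief-weighted predictive distribution Ψ_t is evaluated at the new input to give its surprisal, and the per-context statistics and beliefs are then updated.

```python
import math  # NormalGamma is the class defined in the earlier sketch

H = 0.05                     # assumed prior probability of a context change per step
contexts = [NormalGamma()]   # one initial hypothesis: statistics stable so far
beliefs = [1.0]              # beliefs B_t over context hypotheses

for x in [220.0, 222.0, 219.0, 331.0, 329.0, 333.0]:  # toy pitch sequence (Hz)
    # Context-specific predictive densities P_t evaluated at the new input
    preds = [math.exp(c.log_predictive(x)) for c in contexts]

    # Predictive distribution Psi_t at x, and the surprisal output
    psi_x = sum(b * p for b, p in zip(beliefs, preds))
    surprisal = -math.log(psi_x)

    # Belief update: each existing context continues with probability (1 - H),
    # or a new context starts at this observation with probability H.
    evidence = [b * p for b, p in zip(beliefs, preds)]
    beliefs = [H * sum(evidence)] + [(1.0 - H) * e for e in evidence]
    contexts = [NormalGamma()] + [c.update(x) for c in contexts]
    total = sum(beliefs)
    beliefs = [b / total for b in beliefs]

    print(f"x={x:6.1f}  surprisal={surprisal:5.2f}  "
          f"belief in a change here={beliefs[0]:.2f}")
```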
