Review
Front Psychol. 2024 Dec 11:15:1474961.
doi: 10.3389/fpsyg.2024.1474961. eCollection 2024.

Developmental origins of natural sound perception

Silvia Polver et al. Front Psychol.

Abstract

Infants are exposed to a myriad of sounds early in life, including caregivers' speech, songs, human-made and natural (non-anthropogenic) environmental sounds. While decades of research have established that infants have sophisticated perceptual abilities to process speech, less is known about how they perceive natural environmental sounds. This review synthesizes current findings about the perception of natural environmental sounds in the first years of life, emphasizing their role in auditory development and describing how these studies contribute to the emerging field of human auditory ecology. Some of the existing studies explore infants' responses to animal vocalizations and water sounds. Infants demonstrate an initial broad sensitivity to primate vocalizations, which narrows to human speech through experience. They also show early recognition of water sounds, with preferences for natural over artificial water sounds already at birth, indicating an evolutionarily ancient sensitivity. However, this ability undergoes refinement with age and experience. The few studies available suggest that infants' auditory processing of natural sounds is complex and influenced by both genetic predispositions and exposure. Building on these existing results, this review highlights the need for ecologically valid experimental paradigms that better represent the natural auditory environments in which humans evolved. Understanding how children process natural soundscapes not only deepens our understanding of auditory development but also offers practical insights for advancing environmental awareness, improving auditory interventions for children with hearing loss, and promoting wellbeing through exposure to natural sounds.

Keywords: animal vocalizations; auditory development; children; environmental sounds; human auditory ecology; infants; natural soundscapes; water sounds.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Modulation power spectra (MPS) of natural versus urban soundscapes. MPS shows how modulation power varies as a function of spectral-modulation rate (ordinate) and temporal-modulation rate (abscissa) (see Singh and Theunissen for more information about MPS analysis). These representations highlight the spectral and temporal structure in the spectrogram of sounds (Theunissen and Elie, 2014). MPS were computed on single acoustic recordings conducted in closed and open terrestrial natural habitats (boreal, temperate and tropical forests, a savannah and a desert), and in two typical indoor and outdoor urban settings (street traffic and crowd). Each MPS is normalized by its own maximum modulation power. Sources: B. Krause, Wild Sanctuary (natural soundscapes); S. Meunier, LMA, CNRS, and royalty free sound library SoundBible (urban soundscapes). See Supplementary Appendix for additional information about the stimuli.
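As a rough illustration of the kind of analysis this caption describes, the sketch below computes an MPS as the squared magnitude of the 2-D Fourier transform of a log-amplitude spectrogram, normalized by its own maximum. It is a minimal sketch and not the authors' pipeline: the function name `modulation_power_spectrum`, the Hann window, the frame and hop sizes, and the amplitude-modulated noise used as a demo signal are all assumptions made for the example.

```python
import numpy as np

def modulation_power_spectrum(signal, sample_rate, frame_len=256, hop=64):
    """Return (mps, temporal_mod_hz, spectral_mod_cyc_per_hz) for a 1-D signal.

    Assumed recipe: STFT -> log amplitude -> 2-D FFT -> power -> divide by max.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    # Spectrogram: rows are frequency bins, columns are time frames.
    amplitude = np.abs(np.fft.rfft(frames, axis=0))
    log_spec = np.log(amplitude + 1e-6)
    log_spec -= log_spec.mean()  # remove the overall offset so it cannot dominate
    # The 2-D Fourier transform of the log spectrogram gives joint modulations.
    mps = np.abs(np.fft.fftshift(np.fft.fft2(log_spec))) ** 2
    mps /= mps.max()  # each MPS is normalized by its own maximum power
    # Axis labels: time frames are hop / sample_rate seconds apart,
    # frequency bins are sample_rate / frame_len Hz apart.
    temporal = np.fft.fftshift(np.fft.fftfreq(n_frames, d=hop / sample_rate))
    spectral = np.fft.fftshift(
        np.fft.fftfreq(log_spec.shape[0], d=sample_rate / frame_len)
    )
    return mps, temporal, spectral

# Demo signal (an assumption for illustration): noise amplitude-modulated at 8 Hz.
rng = np.random.default_rng(0)
sr = 8000
t = np.arange(2 * sr) / sr
demo = (1 + 0.9 * np.cos(2 * np.pi * 8 * t)) * rng.normal(size=t.size)

mps, temporal, spectral = modulation_power_spectrum(demo, sr)
row, col = np.unravel_index(np.argmax(mps), mps.shape)
print("peak temporal modulation (Hz):", abs(temporal[col]))
print("peak spectral modulation (cycles/Hz):", spectral[row])
```

Because the demo envelope is the same in every frequency band, the strongest modulation should sit near 8 Hz on the temporal axis and at zero on the spectral axis. Real soundscapes spread power over both axes, which is what the figure panels compare.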
Figure 2
Modulation power spectra (MPS) of natural sounds (bird vocalizations, primate vocalization, insect stridulation and water sounds) and speech sounds. Natural sounds: (i) Bird songs: single recordings from eight bird species selected from a protected European cold forest in eastern France (the Risoux forest). Sources: J.-C. Roché & MNHN; (ii) Primate vocalization: single recording of a baboon “wahoo” vocalization. Source: Gemignani and Gervain (2024); (iii) Insect stridulation: single recording of Tettigonia viridissima, the great green bush-cricket inhabiting the Risoux forest. Source: J. Sueur, MNHN; (iv) Water sounds: single recordings of a single headwater forest stream with distinct water temperatures and discharge rates (here, a slow versus a fast discharge). Source: Klaus et al. (2019). Speech sounds: single sentences recorded in ten different languages from a female speaker: Basque, Dutch, English, French, Japanese, Marathi, Polish, Spanish, Turkish, and Zulu (1 recording per language). Source: Ramus et al. (1999). Each MPS is normalized by its own maximum modulation power. See Supplementary Appendix for additional information about the stimuli.
Figure 3
The generative model of water sounds used in Geffen et al. (2011) and Gervain et al. (2014). The model generates sounds using a population of gammatone chirps, each defined by its frequency, amplitude and cycle constant of decay. These parameters can be set such that the chirps are either (i) scale invariant (upper inset), i.e. the cycle constant of decay is fixed, so the shape of the chirp is constant and frequency is inversely proportional to duration, or (ii) variable scale (lower inset), i.e. duration is held constant and independent of frequency, so the shape of the chirp changes with frequency.
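The contrast between the two settings can be sketched in a few lines. This is an illustrative approximation and not the published model: the chirp formula (a gamma-shaped envelope times a cosine), the function names, the log-uniform frequency range, the amplitude range and the number of chirps are all assumptions chosen for the example. Only the two decay rules reflect the caption: a fixed number of cycles per decay (scale invariant) versus a fixed decay time (variable scale).

```python
import numpy as np

def chirp(freq_hz, amplitude, decay_s, sample_rate=16000, order=2):
    """One gammatone-like chirp (assumed form) with envelope time constant decay_s."""
    t = np.arange(int(8 * decay_s * sample_rate)) / sample_rate
    envelope = t ** (order - 1) * np.exp(-t / decay_s)
    envelope /= envelope.max()  # peak-normalize so amplitude sets the peak level
    return amplitude * envelope * np.cos(2 * np.pi * freq_hz * t)

def scale_invariant_decay(freq_hz, cycles=2.0):
    """Fixed cycle constant: decay lasts the same number of cycles at every frequency."""
    return cycles / freq_hz

def variable_scale_decay(freq_hz, fixed_decay_s=0.002):
    """Fixed duration: decay time ignores frequency, so the chirp shape changes."""
    return fixed_decay_s

def synthesize(decay_rule, n_chirps=200, duration_s=1.0, sample_rate=16000, seed=0):
    """Sum randomly placed chirps; frequencies are drawn log-uniformly (assumption)."""
    rng = np.random.default_rng(seed)
    out = np.zeros(int(duration_s * sample_rate))
    for _ in range(n_chirps):
        freq = np.exp(rng.uniform(np.log(400.0), np.log(4000.0)))
        c = chirp(freq, rng.uniform(0.1, 1.0), decay_rule(freq), sample_rate)
        start = rng.integers(0, len(out) - len(c))
        out[start : start + len(c)] += c
    return out

natural_like = synthesize(scale_invariant_decay)
not_natural_like = synthesize(variable_scale_decay)
```

Under the scale-invariant rule, a 1000 Hz chirp decays in half the time of a 500 Hz chirp, so every chirp is a stretched or compressed copy of the same shape. Under the variable-scale rule, all chirps share one decay time and therefore contain different numbers of cycles.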
Figure 4
Summary statistics computed by a model of auditory texture perception (McWalter and Dau, 2017) in response to the scale-invariant (“natural,” top panels) and scale-variable (“not-natural,” bottom panels) synthetic water sounds used by Geffen et al. (2011) and Gervain et al. (2014). The summary statistics (mean, variance, skew and kurtosis; ordinate) are shown as a function of cochlear channel (abscissa). Cross-band correlations (right-most panels) are shown for each pair of cochlear channels (the hue value from green to yellow covers the 0–1 range of cross-band correlations). The two synthesized sounds differ substantially in terms of their excitation pattern (the internal power spectrum of sounds) and sparsity in each cochlear channel, with scale-invariant (natural) sounds being sparser than scale-variable (not-natural) sounds. The coordination of the temporal envelopes at the output of the cochlear channels also appears to be somewhat different between the two sounds. See Supplementary Appendix for additional information about the computational auditory model.
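The statistics named in this caption can be illustrated on a matrix of subband envelopes. The sketch below is not the McWalter and Dau (2017) model: it skips the cochlear filterbank entirely, takes ready-made envelopes as input, and uses plain moment definitions. The function name `envelope_statistics` and the synthetic "dense" and "sparse" envelopes are assumptions made for the example.

```python
import numpy as np

def envelope_statistics(envelopes):
    """Marginal moments per channel plus cross-band correlations.

    envelopes: array of shape (n_channels, n_samples), non-negative values.
    """
    mean = envelopes.mean(axis=1)
    centered = envelopes - mean[:, None]
    variance = (centered ** 2).mean(axis=1)
    std = np.sqrt(variance)
    skew = (centered ** 3).mean(axis=1) / std ** 3
    kurtosis = (centered ** 4).mean(axis=1) / std ** 4
    # Pearson correlation between every pair of channel envelopes.
    cross_band = np.corrcoef(envelopes)
    return {
        "mean": mean,
        "variance": variance,
        "skew": skew,
        "kurtosis": kurtosis,
        "cross_band": cross_band,
    }

# Synthetic envelopes (assumption): a dense one, and a sparse one that is
# silent about 95% of the time, standing in for two cochlear channels each.
rng = np.random.default_rng(0)
dense = np.abs(rng.normal(size=(2, 5000)))
sparse = dense * (rng.random((2, 5000)) < 0.05)

stats_dense = envelope_statistics(dense)
stats_sparse = envelope_statistics(sparse)
print("kurtosis, dense :", stats_dense["kurtosis"])
print("kurtosis, sparse:", stats_sparse["kurtosis"])
```

In this toy setting, sparsity shows up as higher envelope kurtosis: an envelope that is mostly silent with occasional bursts has heavier tails than one that is continuously active. The cross-band matrix plays the role of the right-most panels, summarizing how strongly envelopes in different channels rise and fall together.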
Figure 5
Summary statistics computed by a model of auditory texture perception (McWalter and Dau, 2017) in response to the hot (top panels) and cold (bottom panels) water sounds used by Agrawal and Schachner (2023). The two sounds show comparable power spectra and envelope sparsity in each cochlear channel. However, the coordination of the temporal envelopes at the output of the cochlear channels appears stronger for the sound of hot water. See Supplementary Appendix for additional information about the stimuli and computational auditory model.

References

    1. Agrawal T., Schachner A. (2023). Hearing water temperature: characterizing the development of nuanced perception of sound sources. Dev. Sci. 26:e13321. doi: 10.1111/desc.13321
    2. Altmann C. F., Doehrmann O., Kaiser J. (2007). Selectivity for animal vocalizations in the human auditory cortex. Cereb. Cortex 17, 2601–2608. doi: 10.1093/cercor/bhl167
    3. Apoux F., Miller-Viacava N., Ferrière R., Dai H., Krause B., Sueur J., et al. (2023). Auditory discrimination of natural soundscapes. J. Acoust. Soc. Am. 153, 2706–2706. doi: 10.1121/10.0017972
    4. Belin P. (2006). Voice processing in human and non-human primates. Philos. Trans. R. Soc. B Biol. Sci. 361, 2091–2107. doi: 10.1098/rstb.2006.1933
    5. Cassarino M., Setti A. (2016). Complexity as key to designing cognitive-friendly environments for older people. Front. Psychol. 7:1329. doi: 10.3389/fpsyg.2016.01329
