Front Psychol. 2022 Dec 16:13:914239.
doi: 10.3389/fpsyg.2022.914239. eCollection 2022.

Rational speech comprehension: Interaction between predictability, acoustic signal, and noise

Marjolein Van Os et al. Front Psychol. 2022.

Abstract

Introduction: During speech comprehension, multiple sources of information are available to listeners, which are combined to guide the recognition process. Models of speech comprehension posit that when the acoustic speech signal is obscured, listeners rely more on information from other sources. However, these models take into account only word frequency information and local contexts (surrounding syllables), but not sentence-level information. To date, empirical studies investigating predictability effects in noise have not carefully controlled the tested speech sounds, while the literature investigating the effect of background noise on the recognition of speech sounds has not manipulated sentence predictability. Additionally, studies on the effect of background noise show conflicting results regarding which noise type affects speech comprehension most. We address these gaps in the present experiment.

Methods: We investigate how listeners combine information from different sources when listening to sentences embedded in background noise. We manipulate top-down predictability, type of noise, and characteristics of the acoustic signal. This creates conditions that differ in the extent to which a specific speech sound is masked, with the masking grounded in prior work on the confusability of speech sounds in noise. Participants complete an online word recognition experiment.

Results and discussion: The results show that participants rely more on the provided sentence context when the acoustic signal is harder to process. This is the case even when interactions of the background noise and speech sounds lead to small differences in intelligibility. Listeners probabilistically combine top-down predictions based on context with noisy bottom-up information from the acoustic signal, leading to a trade-off between the different types of information that is dependent on the combination of a specific type of background noise and speech sound.
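
The probabilistic combination described above can be illustrated with a minimal Bayesian sketch. It is not the authors' model: the function name `posterior`, the two candidate labels, and all probability values are hypothetical and chosen only to show how a context-based prior gains influence as the acoustic evidence becomes less reliable.

```python
def posterior(prior, likelihood):
    """Combine a context-based prior with acoustic likelihoods via Bayes' rule.

    P(word | context, signal) is proportional to
    P(word | context) * P(signal | word), normalized over the candidates.
    """
    unnormalized = {word: prior[word] * likelihood[word] for word in prior}
    total = sum(unnormalized.values())
    return {word: value / total for word, value in unnormalized.items()}


# Hypothetical values: the sentence context favours a similar-sounding
# competitor over the word that was actually spoken.
context_prior = {"spoken": 0.2, "competitor": 0.8}

# Hypothetical acoustic likelihoods: in quiet the signal clearly supports
# the spoken word; under masking noise the two candidates are nearly
# indistinguishable.
clear_signal = {"spoken": 0.90, "competitor": 0.10}
masked_signal = {"spoken": 0.55, "competitor": 0.45}

in_quiet = posterior(context_prior, clear_signal)
in_noise = posterior(context_prior, masked_signal)

print(f"quiet: P(spoken) = {in_quiet['spoken']:.2f}")
print(f"noise: P(spoken) = {in_noise['spoken']:.2f}")
```

With these illustrative numbers, the spoken word is the more probable candidate in quiet, whereas under masking the context-favoured competitor becomes more probable. This mirrors the trade-off reported in the abstract, in which reliance on sentence context grows as the signal becomes harder to process.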

Keywords: background noise; mishearing; noisy channel; predictive context; rational processing; speech comprehension.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
The different stages of the experiment, with a single trial between brackets. Participants completed 4 practice trials and 90 experimental trials (half of 180 to keep the experiment at a manageable length).
Figure 2
The proportion of participants’ responses (wrong, target, and distractor) for each of the three noise conditions (quiet, babble, white noise) and three sound types (plosives, fricatives, vowels) for the high predictability condition.
Figure 3
The proportion of participants’ responses (wrong, target, and distractor) for each of the three noise conditions (quiet, babble, white noise) and three sound types (plosives, fricatives, vowels) for the low predictability condition.
Figure 4
The wrong responses that semantically fit or did not fit the sentence, plotted with the normalized phonetic distance, in each of the three noise conditions. Lower phonetic distance means more similar to the target item. The vertical black lines show the mean phonetic distance for each condition. Each dot represents a single wrong response; the shaded curves show the density plots for these.
