Front Psychol. 2022 Jul 19:13:879156.
doi: 10.3389/fpsyg.2022.879156. eCollection 2022.

Semantic Cues Modulate Children's and Adults' Processing of Audio-Visual Face Mask Speech

Julia Schwarz et al. Front Psychol.

Abstract

During the COVID-19 pandemic, questions have been raised about the impact of face masks on communication in classroom settings. However, it is unclear to what extent visual obstruction of the speaker's mouth or changes to the acoustic signal lead to speech processing difficulties, and whether these effects can be mitigated by semantic predictability, i.e., the availability of contextual information. The present study investigated the acoustic and visual effects of face masks on speech intelligibility and processing speed under varying semantic predictability. Twenty-six children (aged 8-12) and twenty-six adults performed an internet-based cued shadowing task, in which they had to repeat aloud the last word of sentences presented in audio-visual format. The results showed that children and adults made more mistakes and responded more slowly when listening to face mask speech compared to speech produced without a face mask. Adults were only significantly affected by face mask speech when both the acoustic and the visual signal were degraded. While acoustic mask effects were similar for children, removal of visual speech cues through the face mask affected children to a lesser degree. However, high semantic predictability reduced audio-visual mask effects, leading to full compensation of the acoustically degraded mask speech in the adult group. Even though children did not fully compensate for face mask speech with high semantic predictability, overall, they still profited from semantic cues in all conditions. Therefore, in classroom settings, strategies that increase contextual information, such as building on students' prior knowledge, using keywords, and providing visual aids, are likely to help overcome any adverse face mask effects.

Keywords: audio-visual integration; bottom-up vs. top-down; cued shadowing; face masks; internet-based data collection; language development; semantic prediction; speech processing.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
The speaker with and without a face mask in neutral position.
Figure 2
Trial design for capturing reaction times (RT). X1: Duration of trial recording from the beep to the response onset. X2: Duration of stimulus from the beep to the end of the presented sentence.
Figure 3
Individual main effects on subjects’ mean response accuracy in % averaged by Age Group based on the raw data (from left to right: Acoustic Mask Effect, Visual Mask Effect, Cloze Probability Effect).
Figure 4
Individual main effects on mean reaction times in ms (with error bars) by Age Group based on the raw data (from left to right: Acoustic Mask Effect, Visual Mask Effect, Cloze Probability Effect).
Figure 5
Interaction effects on mean reaction times in ms (with error bars) by Age Group based on the raw data (on the left: Acoustic Mask*Visual Mask; on the right: Acoustic Mask*Cloze Probability). Interactions in the mixed-model analysis were significant only for the adult comparisons (solid lines).
Figure 6
Mean reaction times across the four experiment blocks by Age Group based on the raw data, comparing fully masked (+ Acoustic Mask, + Visual Mask) and fully unmasked (– Acoustic Mask, – Visual Mask) conditions.
