Left Motor δ Oscillations Reflect Asynchrony Detection in Multisensory Speech Perception

Emmanuel Biau et al. J Neurosci. 2022 Mar 16;42(11):2313-2326. doi: 10.1523/JNEUROSCI.2965-20.2022. Epub 2022 Jan 27.

Abstract

During multisensory speech perception, slow δ oscillations (∼1-3 Hz) in the listener's brain synchronize with the speech signal, likely supporting decomposition of the speech signal. Notable fluctuations in the speech amplitude envelope, reflecting speaker prosody, temporally align with articulatory and body gestures, and both provide complementary sensory cues that temporally structure speech. Further, δ oscillations in the left motor cortex seem to align with speech and musical beats, suggesting a possible role in the temporal structuring of (quasi-)rhythmic stimulation. We extended the role of δ oscillations to audiovisual asynchrony detection as a test case of the temporal analysis of multisensory prosodic fluctuations in speech. We recorded electroencephalography (EEG) responses in an audiovisual asynchrony detection task while participants watched videos of a speaker. We filtered the speech signal to remove verbal content and examined how visual and auditory prosodic features temporally (mis-)align. Results confirmed that (1) participants accurately detected audiovisual asynchrony; (2) δ power in the left motor cortex increased in response to audiovisual asynchrony, and the difference in δ power between asynchronous and synchronous conditions predicted behavioral performance; and (3) δ-β coupling in the left motor cortex decreased when listeners could not accurately map visual and auditory prosodies. Finally, both the behavioral and the neurophysiological effects were altered when the speaker's face was degraded by a visual mask. Together, these findings suggest that motor δ oscillations support asynchrony detection of multisensory prosodic fluctuations in speech.

Significance Statement: Speech perception is facilitated by regular prosodic fluctuations that temporally structure the auditory signal. Auditory speech processing involves the left motor cortex and associated δ oscillations. However, visual prosody (i.e., a speaker's body movements) complements auditory prosody, and it is unclear how the brain temporally analyses different prosodic features in multisensory speech perception. We combined an audiovisual asynchrony detection task with electroencephalographic (EEG) recordings to investigate how δ oscillations support the temporal analysis of multisensory speech. Results confirmed that asynchrony detection of visual and auditory prosodies leads to increased δ power in the left motor cortex, which correlates with performance. We conclude that δ oscillations are recruited to resolve detected temporal asynchrony in multisensory speech perception.

Keywords: audio-visual asynchrony; motor cortex; multisensory speech; prosody; δ oscillations.
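
The abstract describes isolating slow (∼1-3 Hz) prosodic fluctuations from the speech signal and relating them to EEG activity. As a rough illustration of that kind of preprocessing, and not the authors' actual pipeline, the sketch below computes a speech amplitude envelope with a Hilbert transform and band-pass filters it to the δ range using SciPy; the file path, envelope sampling rate, and filter settings are illustrative assumptions.

```python
# Minimal sketch (not the authors' pipeline): extract a speech amplitude
# envelope and keep only slow delta-range (1-3 Hz) prosodic fluctuations.
import numpy as np
from scipy.io import wavfile
from scipy.signal import hilbert, butter, filtfilt, resample

def delta_envelope(wav_path, band=(1.0, 3.0), env_rate=100):
    """Return the 1-3 Hz filtered amplitude envelope of a speech recording."""
    sr, audio = wavfile.read(wav_path)        # wav_path is hypothetical
    audio = audio.astype(float)
    if audio.ndim > 1:                        # collapse stereo to mono
        audio = audio.mean(axis=1)

    envelope = np.abs(hilbert(audio))         # broadband amplitude envelope

    # Downsample the envelope before low-frequency filtering.
    n_out = int(len(envelope) * env_rate / sr)
    envelope = resample(envelope, n_out)

    # Zero-phase Butterworth band-pass restricted to the delta range.
    b, a = butter(3, [band[0], band[1]], btype="bandpass", fs=env_rate)
    return filtfilt(b, a, envelope)
```

Under these assumptions, an asynchronous version of a trial could then be simulated simply by shifting such an envelope by 400 ms relative to the video frames.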


Figures

Figure 1.
Experimental procedure of the audiovisual asynchrony detection task. A, The four experimental conditions. For each item, the audio signal was the same across all four versions. Visual information was manipulated by the presence or absence of a mask (no-mask or head-mask). Video and sound were either temporally aligned in the synchronous conditions (NMS, HMS) or temporally misaligned by 400 ms in the asynchronous conditions (NMA, HMA). B, Example of one trial timeline. C, Distribution of the electrodes covering the motor region of interest (ROI; blue circles) and the control region of non-interest in the visual area (RONI; red circles). D, Examples of the stimuli presented in the no-mask (synchronous NMS and asynchronous NMA; upper picture) and head-mask conditions (synchronous HMS and asynchronous HMA; bottom picture).
Figure 2.
Behavioral performance in the asynchrony detection task. A, Average d' scores and correct response rates (±SEM, standard error of the mean; gray dots represent individual averages; n = 23). B, Reaction times of correct responses across conditions (±SEM; gray dots represent individual averages). Significant contrasts are marked by asterisks (p < 0.05).
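
Figure 2 reports sensitivity as d' scores. For readers unfamiliar with the measure, the sketch below shows one standard way to compute d' from hit and false-alarm counts, with a simple correction to keep rates away from 0 and 1; the counts are made up and this is not tied to the authors' analysis code.

```python
# Illustrative d' (d-prime) computation from hit and false-alarm counts.
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Signal-detection sensitivity: z(hit rate) - z(false-alarm rate)."""
    # Log-linear style correction keeps rates away from exactly 0 or 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Example with made-up counts: 40 asynchronous and 40 synchronous trials.
print(d_prime(hits=32, misses=8, false_alarms=6, correct_rejections=34))
```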
Figure 3.
δ Responses to audiovisual asynchrony at the scalp level. A, Time-frequency spectra of the mean power differences in the motor ROI between asynchronous and synchronous conditions in the no-mask (NMA-NMS; left) and head-mask (HMA-HMS; right) contrasts. The white dashed lines correspond to the onset of the video, and the window of interest is marked by the pink dashed rectangles. B, Topographical distribution of the difference of 2- to 3-Hz δ power in the time window of interest, in the no-mask (NMA-NMS; top) and head-mask (HMA-HMS; bottom) contrasts. The pink dots display electrodes with significant t values (α threshold = 0.05). C, δ Power across the electrodes of interest in the four conditions (2- to 3-Hz band). Significant contrasts are marked by asterisks (p < 0.05).
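
Figure 3 is built from time-frequency power estimates in the 2- to 3-Hz band. As a self-contained illustration of how such an estimate can be obtained (the authors' exact time-frequency settings are not restated here), the sketch below convolves a single EEG channel with complex Morlet wavelets using NumPy; the sampling rate, wavelet width, and simulated data are assumptions.

```python
# Minimal sketch: time-frequency power via complex Morlet wavelet convolution.
import numpy as np

def morlet_power(signal, sfreq, freqs, n_cycles=5):
    """Return power (n_freqs x n_times) for one EEG channel."""
    power = np.empty((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2 * np.pi * f)            # wavelet width in seconds
        t = np.arange(-4 * sigma_t, 4 * sigma_t, 1 / sfreq)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # unit-energy wavelet
        analytic = np.convolve(signal, wavelet, mode="same")
        power[i] = np.abs(analytic) ** 2
    return power

# Usage with simulated data: 10 s of one channel sampled at 250 Hz.
sfreq = 250.0
eeg = np.random.randn(int(10 * sfreq))
freqs = np.arange(1.0, 5.5, 0.5)                         # covers the 2-3 Hz band
tfr = morlet_power(eeg, sfreq, freqs)
delta_power = tfr[(freqs >= 2) & (freqs <= 3)].mean()    # mean 2-3 Hz power
```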
Figure 4.
Comparisons between the motor ROI and the visual RONI. A, TFRs of the spectral power difference in the no-mask contrast (NMA-NMS) in the ROI and RONI. B, The mean differences of 2- to 3-Hz δ power (NMA-NMS and HMA-HMS) were computed in the ROI and RONI. Significant contrasts are marked by asterisks (p < 0.05).
Figure 5.
δ Oscillation responses to audiovisual asynchrony at the source level for the no-mask and head-mask contrasts. A, Contrast NMA-NMS projected onto the brain's surface (significant t values; cluster-corrected at α threshold = 0.05). The maximum voxel was located in the left precentral gyrus (MNI coordinates [−50 19 40]), but significant activation was also found in the left inferior frontal gyrus (pars triangularis; maximum voxel MNI coordinates [−30 31 0]). No significant difference was found when the head of the speaker was masked (HMA-HMS contrast; not represented). B, Scatterplots of audiovisual asynchrony detection performance and δ power in the significant cluster region (left motor cortex). The difference in δ power in the left motor cluster (ΔPower; x-axis; z scores) correlated with the difference in audiovisual asynchrony detection (ΔCR; y-axis; z scores) between asynchronous and synchronous conditions only when the face of the speaker was visible and participants could integrate video and audio onsets (no-mask conditions). C, Average δ power differences between correct and incorrect trials from the significant left motor cluster in the four conditions NMS, NMA, HMS, and HMA (±SEM; gray dots represent individual averages; n = 23; outliers not represented). Significant differences from zero are marked by asterisks (p < 0.05). D, left panel, Peak frequency correspondence between the δ activity carried in the video clips and the δ power responses induced in the left motor cluster. The bars represent the mean absolute distance between the δ peak frequencies in the stimulus and the peak frequencies of neural δ power induced during the corresponding trial (±SEM). Peak frequency matching was assessed for the synchronous and asynchronous conditions in the two mask conditions (no-mask and head-mask) and for the different signal types of each stimulus: full, head only, body only, and audio only (see Table 1). The means of the absolute difference scores were significantly greater than zero in all conditions and for all stimulus signals. D, right panel, Consistency of δ peaks across the ordered stimuli in all four conditions. The upper panel displays the mean δ peak frequencies in the left motor cortex across all participants (±SEM) for each stimulus in the no-mask conditions (black squares: NMS; orange squares: NMA). The lower panel displays the mean EEG δ peak frequencies across all participants (±SEM) for each stimulus in the head-mask conditions (black squares: HMS; orange squares: HMA). The variations in δ responses observed across all conditions reflect a difference of power amplitude modulation on the same oscillatory activity.
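
Figure 5B relates the asynchronous-minus-synchronous δ power difference to the corresponding behavioral difference across participants. A bare-bones version of that brain-behavior correlation, with simulated z-scored values standing in for the real per-participant data, could look like the following; the choice of Pearson's r here is an assumption, not a restatement of the paper's statistics.

```python
# Illustrative brain-behavior correlation with simulated, z-scored values.
import numpy as np
from scipy.stats import pearsonr, zscore

rng = np.random.default_rng(0)
n_participants = 23

# Simulated per-participant differences (asynchronous minus synchronous).
delta_power_diff = rng.normal(size=n_participants)              # ΔPower stand-in
correct_rate_diff = 0.5 * delta_power_diff + rng.normal(size=n_participants)

r, p = pearsonr(zscore(delta_power_diff), zscore(correct_rate_diff))
print(f"r = {r:.2f}, p = {p:.3f}")
```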
Figure 6.
PAC between δ and β oscillations. A, PAC analysis in the left motor cluster. The figure represents the modulation of δ-β PAC in a significant cluster, dependent on the mask and on audiovisual asynchrony. Significance is indicated by an asterisk (p < 0.05, Bonferroni-corrected). δ-β PAC from the left motor cortex was greater in the no-mask than in the head-mask conditions but did not discriminate between correct and incorrect trials. Significant contrasts are marked by asterisks (p < 0.05). B, δ-β PAC difference between the no-mask (NMA + NMS) and head-mask (HMA + HMS) conditions across the whole brain. Results revealed significant maximum differences located in the supplementary motor area (MNI coordinates [0 11 50]) and in the left middle temporal lobe (MNI coordinates [−50 −1 −20]).
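
Figure 6 quantifies δ-β phase-amplitude coupling (PAC). The paper's exact PAC metric is not restated in this caption; as one common option, the sketch below computes a mean-vector-length coupling estimate by combining the δ phase and β amplitude of a single channel. The filter bands, sampling rate, and simulated data are illustrative assumptions.

```python
# Minimal sketch of delta-beta phase-amplitude coupling (mean vector length).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, low, high, sfreq, order=3):
    """Zero-phase Butterworth band-pass filter."""
    b, a = butter(order, [low, high], btype="bandpass", fs=sfreq)
    return filtfilt(b, a, x)

def pac_mvl(x, sfreq, phase_band=(1.0, 3.0), amp_band=(15.0, 25.0)):
    """Mean vector length PAC between a slow phase band and a faster amplitude band."""
    phase = np.angle(hilbert(bandpass(x, *phase_band, sfreq)))   # delta phase
    amp = np.abs(hilbert(bandpass(x, *amp_band, sfreq)))         # beta amplitude
    return np.abs(np.mean(amp * np.exp(1j * phase)))

# Usage with simulated data (the value is meaningless for pure noise).
sfreq = 250.0
eeg = np.random.randn(int(60 * sfreq))
print(pac_mvl(eeg, sfreq))
```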

