Front Psychol. 2011 Sep 26;2:238.
doi: 10.3389/fpsyg.2011.00238. eCollection 2011.

An assessment of behavioral dynamic information processing measures in audiovisual speech perception

Nicholas Altieri et al. Front Psychol.

Abstract

Research has shown that visual speech perception can improve accuracy in the identification of spoken words. However, little is known about the dynamics of the processing mechanisms involved in audiovisual integration. In particular, architecture and capacity, measured using response time methodologies, have not been investigated. An issue related to architecture concerns whether the auditory and visual sources of the speech signal are integrated "early" or "late." We propose that "early" integration most naturally corresponds to coactive processing, whereas "late" integration corresponds to separate decisions parallel processing. We implemented the double factorial paradigm in two studies. First, we carried out a pilot study using a two-alternative forced-choice discrimination task to assess architecture and decision rule and to provide a preliminary assessment of capacity (integration efficiency). Next, Experiment 1 was designed to assess audiovisual integration efficiency in an ecologically valid way by including lower auditory signal-to-noise (S/N) ratios and a larger response set size. Results from the pilot study support a separate decisions parallel, late integration model. Results from both studies showed that capacity was severely limited for high auditory S/N ratios. However, Experiment 1 demonstrated that capacity improved as the auditory signal became more degraded. This evidence strongly suggests that integration efficiency depends critically on the S/N ratio.

Keywords: capacity; coactive; multisensory integration; parallel; speech.

Figures

Figure 1
A diagram of a parallel model (top) with an OR and an AND gate (see also Townsend and Nozawa for a similar diagram). The coactive model below it assumes that the channels are pooled into a common processor where evidence is accumulated prior to the decision stage. Lastly, the figure depicts a serial model, which assumes that processing of the second modality does not begin until processing of the first has finished.
Figure 2
SIC(t) predictions for standard independent parallel, serial, and coactive models. The two top panels display the predictions of the independent parallel first-terminating and exhaustive models respectively, while the middle panels display the predictions of the serial first-terminating and exhaustive models respectively. The bottom panel displays the coactive model predictions. The SIC(t) is plotted against arbitrary time units (AU).
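The survivor interaction contrast described in this caption can be sketched numerically. The block below is a minimal illustration, not the authors' analysis code: it assumes the standard double factorial definition SIC(t) = [S_LL(t) - S_LH(t)] - [S_HL(t) - S_HH(t)], where S is the survivor function of the response times and L/H denote low and high salience of the auditory and visual channels. The function names (`survivor`, `sic`, `race`), the exponential channel finishing times, and the scale values are all hypothetical choices made for the simulation.

```python
import numpy as np

def survivor(rts, t_grid):
    """Empirical survivor function S(t) = P(RT > t), evaluated on t_grid."""
    rts = np.sort(np.asarray(rts, dtype=float))
    # searchsorted(side="right") counts how many RTs are <= t
    return 1.0 - np.searchsorted(rts, t_grid, side="right") / rts.size

def sic(rt_ll, rt_lh, rt_hl, rt_hh, t_grid):
    """SIC(t) = [S_LL(t) - S_LH(t)] - [S_HL(t) - S_HH(t)]."""
    s_ll, s_lh, s_hl, s_hh = (
        survivor(r, t_grid) for r in (rt_ll, rt_lh, rt_hl, rt_hh)
    )
    return (s_ll - s_lh) - (s_hl - s_hh)

# Simulated independent parallel first-terminating (OR) model:
# the response is triggered by whichever channel finishes first.
rng = np.random.default_rng(0)
n = 50_000
LOW, HIGH = 400.0, 200.0  # assumed mean finishing times; low salience is slower

def race(scale_a, scale_v):
    """RT = min of two independent exponential channel finishing times."""
    return np.minimum(rng.exponential(scale_a, n), rng.exponential(scale_v, n))

t_grid = np.linspace(0.0, 1500.0, 151)  # arbitrary time units
sic_curve = sic(race(LOW, LOW), race(LOW, HIGH),
                race(HIGH, LOW), race(HIGH, HIGH), t_grid)
```

Under these assumptions the simulated curve stays at or above zero apart from sampling noise, which matches the entirely positive SIC(t) expected for an independent parallel first-terminating model.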
Figure 3
Predicted workload capacity, C(t), for independent parallel models (left) and coactive models (right). Note that the coactive model predicts extreme super capacity, while independent parallel models predict C(t) = 1 (the benchmark for efficient audiovisual processing or integration). Standard serial models generally predict C(t) = 1/2, and parallel models with negative cross-talk can readily mimic such predictions.
Figure 4
(A) SIC(t) function for one exemplary participant in the pilot study. This participant showed evidence for parallel first-terminating processing (as did three other participants). Only one participant produced an SIC(t) consistent with another model, which was coactive processing. (B) The capacity results computed for the same participant. C(t) was computed across all saliency levels (i.e., the integrated hazard functions from the AV, A-only, and V-only conditions included RTs from each level of saliency). Each participant yielded strong evidence for extremely limited capacity, a finding inconsistent with coactivation, but consistent with a parallel model with cross-channel inhibition.
Figure 5
The capacity coefficient C(t) for each participant across all three experimental conditions. The top panel (A) shows C(t) for five participants in the condition where the auditory S/N ratio was −18 dB. Each participant except Participant 3 showed super capacity (violating the bound C(t) = 1, or the upper Miller bound in capacity space; Eidels et al., 2011). As the legend shows, C(t) is denoted by the dots, the upper bound by the solid curve, and the lower bound by the dashed line. Panel (B) shows C(t) for five participants in the condition with an auditory S/N ratio of −12 dB, and the bottom panel (C) shows C(t) for five participants in the condition without any degradation of the auditory signal.
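The capacity coefficient referred to in the captions above can be illustrated with a short simulation. This is a minimal sketch, not the authors' code: it assumes the standard OR-task definition C(t) = H_AV(t) / [H_A(t) + H_V(t)], where H(t) = -log S(t) is the integrated hazard function of the response times in the audiovisual, auditory-only, and visual-only conditions. The function names, the exponential finishing times, and the scale values are hypothetical.

```python
import numpy as np

def cumulative_hazard(rts, t_grid):
    """Integrated hazard H(t) = -log S(t) from the empirical survivor function."""
    rts = np.sort(np.asarray(rts, dtype=float))
    surv = 1.0 - np.searchsorted(rts, t_grid, side="right") / rts.size
    with np.errstate(divide="ignore"):
        return -np.log(surv)  # becomes inf once no RTs exceed t

def capacity_or(rt_av, rt_a, rt_v, t_grid):
    """C(t) = H_AV(t) / (H_A(t) + H_V(t)); undefined points are returned as NaN."""
    h_av = cumulative_hazard(rt_av, t_grid)
    denom = cumulative_hazard(rt_a, t_grid) + cumulative_hazard(rt_v, t_grid)
    with np.errstate(divide="ignore", invalid="ignore"):
        c = h_av / denom
    c[~np.isfinite(c)] = np.nan
    return c

# Simulated benchmark: an independent parallel race with unlimited capacity,
# so the audiovisual RT is the faster of two independent channels.
rng = np.random.default_rng(1)
n = 50_000
rt_a = rng.exponential(300.0, n)   # auditory-only trials (assumed scale)
rt_v = rng.exponential(500.0, n)   # visual-only trials (assumed scale)
rt_av = np.minimum(rng.exponential(300.0, n), rng.exponential(500.0, n))

t_grid = np.linspace(50.0, 600.0, 56)
c_t = capacity_or(rt_av, rt_a, rt_v, t_grid)
```

In this simulated benchmark C(t) hovers around 1. Values reliably above 1 would correspond to the super capacity reported for the −18 dB condition, and values well below 1 to the limited capacity reported at high auditory S/N ratios.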

