Behav Res Methods. 2021 Apr;53(2):846-863.
doi: 10.3758/s13428-020-01464-7.

A psycholinguistic method for measuring coarticulation in child and adult speech

Phil J Howson et al. Behav Res Methods. 2021 Apr.

Abstract

A perceiver's ability to accurately predict target sounds in a forward-gated AV speech task indexes the strength and scope of anticipatory coarticulation in adult speech (Redford et al., JASA, 144, 2447-2461, 2018). This suggests a perception-based method for studying coarticulation in populations who may poorly tolerate the more invasive or restrictive techniques used to measure speech movements directly. But the use of perception to measure production raises the question of confounding influences on perceiver performance and thus on the reliability and generalizability of the proposed method. The present study was therefore designed to test whether a gated AV speech method for measuring coarticulation provides reliable results across different study populations (child versus adult), different task environments (in-lab versus online), and different coarticulatory directions (forward/anticipatory versus backward/carryover). The results indicated excellent measurement reliability across age groups in the forward/anticipatory measurement direction, though more perceivers are needed to achieve the same levels of agreement and consistency when the task is completed online. Accuracy was lower in the backward/carryover direction, and although agreement and consistency were still reasonably high across perceivers, the effect of age group differed between the laboratory and online environments, suggesting measurement error in one or both environments. Overall, the results support using in-lab or online perceptual judgments to measure anticipatory coarticulation in developmental studies of speech production. Further validation is needed before the method can be extended to measure carryover coarticulation.

Keywords: AV speech perception; Gating paradigm; Online perception task; Speech production.

Figures

Figure 1.
The final frame from each of six stimuli derived from a single production of the sentence “Maddy packs the goat” by a 5-year-old speaker. Gates were placed at the segmental landmarks shown, beginning with the verb, using acoustic and kinematic cues in the audiovisual recording.
Figure 2.
SS-ANOVA plots for the schwa in “the” produced by the 6 adult (left) and 6 child (right) speakers as a function of the following noun: formants for schwa before “gak” are shown in black; those for schwa before “goat” are shown in grey. Dotted lines indicate 95% confidence intervals.
Figure 3.
Perceiver performance in the forward prediction task is shown as a function of the speakers’ age (i.e., group: adult versus child) and sentence gate at the segmental landmarks indicated, where “æ” indicates the midpoint of the stressed vowel in “packs” and “T” the midpoint of the target vowel (/æ/ or /oʊ/) in the object noun. Whiskers indicate the 95% confidence intervals and the dotted line indicates chance performance. The data are from perceivers who completed the task in a laboratory setting.
Figure 4.
Perceiver performance in the forward prediction task is shown as a function of the speakers’ age (i.e., group: adult versus child) and sentence gate at the segmental landmarks indicated, where “æ” indicates the midpoint of the stressed vowel in “packs” and “T” the midpoint of the target vowel (/æ/ or /oʊ/) in the object noun. Whiskers indicate the 95% confidence intervals and the dotted line indicates chance performance. The data are from perceivers who completed the task online.
Figure 5.
ICC values with 95% confidence intervals for perceiver agreement and consistency in the prediction task when listening to forward-gated AV speech produced by adults and children in the laboratory versus online condition.
Figure 6.
The initial frame from each of six stimuli derived from a single production of the sentence “Maddy packs the goat” by a 5-year-old speaker. Gates were placed at the segmental landmarks shown, beginning with the verb, using acoustic and kinematic cues in the audiovisual recording.
Figure 7.
SS-ANOVA plots for the schwa in “the” produced by the 6 adult (left) and 6 child (right) speakers as a function of the preceding verb: formants for schwa after “packs” are shown in black; those for schwa after “pokes” are shown in grey. Dotted lines indicate 95% confidence intervals.
Figure 8.
Perceiver performance in the backward prediction task is shown as a function of the speakers’ age (i.e., group: adult versus child) and sentence gate at the segmental landmarks indicated, where “T” indicates the midpoint of the target vowel (/æ/ or /oʊ/) in the verb and “æ” indicates the midpoint of the stressed vowel in the noun “gak.” Whiskers indicate the 95% confidence intervals and the dotted line indicates chance performance. The data are from perceivers who completed the task in the laboratory.
Figure 9.
Perceiver performance in the backward prediction task is shown as a function of the speakers’ age (i.e., group: adult versus child) and sentence gate at the segmental landmarks indicated, where “T” indicates the midpoint of the target vowel (/æ/ or /oʊ/) in the verb and “æ” indicates the midpoint of the stressed vowel in the noun “gak.” Whiskers indicate the 95% confidence intervals and the dotted line indicates chance performance. The data are from perceivers who completed the task online.
Figure 10.
ICC values with 95% confidence intervals for perceiver agreement and consistency in the prediction task when listening to backward-gated AV speech produced by adults and children in the laboratory versus online condition.
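
Figures 5 and 10 report intraclass correlation coefficients (ICCs) for perceiver agreement and for perceiver consistency. The abstract does not say which ICC model the authors fitted, so the following is only a sketch under an assumption: two-way, single-rater ICCs computed from ANOVA mean squares, with rows as rated items and columns as perceivers. The function names are hypothetical, the confidence intervals shown in the figures are not computed here, and the ratings are the six-target, four-judge textbook example from Shrout and Fleiss (1979), not data from the study.

```python
def mean_squares(ratings):
    """Return (MS_rows, MS_cols, MS_error) for a complete items x raters table."""
    n = len(ratings)          # number of rated items (rows)
    k = len(ratings[0])       # number of raters/perceivers (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols   # residual after row and column effects

    return ss_rows / (n - 1), ss_cols / (k - 1), ss_error / ((n - 1) * (k - 1))


def icc_single(ratings):
    """Return (consistency, agreement) single-rater ICCs for a two-way layout."""
    n, k = len(ratings), len(ratings[0])
    ms_rows, ms_cols, ms_error = mean_squares(ratings)
    # Consistency ignores systematic offsets between raters.
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    # Agreement penalizes those offsets through the rater (column) variance.
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return consistency, agreement


# Illustrative ratings only: six targets (rows) scored by four judges (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

consistency, agreement = icc_single(example)
print(f"consistency ICC = {consistency:.2f}")  # approximately 0.71
print(f"agreement ICC   = {agreement:.2f}")    # approximately 0.29
```

In this example the judges rank the targets similarly but use different parts of the scale, so consistency is much higher than agreement. That is the distinction the two ICC types in the figures are meant to separate.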
