J Speech Lang Hear Res. 2023 Aug 17;66(8S):3132-3150.
doi: 10.1044/2023_JSLHR-22-00263. Epub 2023 Apr 18.

Speech Entrainment in Adolescent Conversations: A Developmental Perspective

Camille J Wynn et al. J Speech Lang Hear Res.

Abstract

Purpose: Defined as the similarity of speech behaviors between interlocutors, speech entrainment plays an important role in successful adult conversations. According to theoretical models of entrainment and research on motoric, cognitive, and social developmental milestones, the ability to entrain should develop throughout adolescence. However, little is known about the specific developmental trajectory or the role of speech entrainment in conversational outcomes of this age group. The purpose of this study is to characterize speech entrainment patterns in the conversations of neurotypical early adolescents.

Method: This study used a corpus of 96 task-based conversations among adolescents aged 9 to 14 years and a comparison corpus of 32 task-based conversations between adults. For each conversational turn, two speech entrainment scores were calculated for each of 429 acoustic features across rhythmic, articulatory, and phonatory dimensions. Predictive modeling was used to evaluate the degree of entrainment and the relationship between entrainment and two metrics of conversational success.

Results: Speech entrainment increased throughout early adolescence but did not reach the level exhibited in conversations between adults. Additionally, speech entrainment was predictive of both conversational quality and conversational efficiency. Furthermore, models that included all acoustic features and both entrainment types performed better than models that only included individual acoustic feature sets or one type of entrainment.

Conclusions: Our findings show that speech entrainment skills are largely developed during early adolescence, with continued development possibly occurring across later adolescence. Additionally, results highlight the role of speech entrainment in successful conversation in this population, suggesting the importance of continued exploration of this phenomenon in both neurotypical and neurodivergent adolescents. We also provide evidence of the value of using holistic measures that capture the multidimensionality of speech entrainment and provide a validated methodology for investigating entrainment across multiple acoustic features and entrainment types.

Figures

Figure 1.
Visual representation of some of the components necessary for entrainment to occur.
Figure 2.
Overview of methodological process for this study. Spoken dialogs are divided into individual speaking turns. A total of 429 acoustic features (divided into five acoustic feature sets) are then extracted from each speaking turn in every conversation. Proximity and synchrony scores are calculated for each acoustic feature, yielding 858 entrainment scores per speaking turn. Predictive modeling is used to evaluate the degree of entrainment (i.e., degree to which entrainment scores could be used to distinguish real and sham conversational turns) and the relationship between entrainment and conversational success (i.e., degree to which entrainment scores could be used to predict conversational efficiency and quality scores). EMS = envelope modulation spectrum; LTAS = long-term average spectrum; MFCC = mel-frequency cepstrum coefficient; VR = voice report.
Figure 3.
Schematic illustrating two types of entrainment evaluated within this study. Proximity represents similarity in the speech features between two interlocutors. Synchrony represents similarity in the movement of speech features between two interlocutors.
Figure 4.
Predictive accuracy of entrainment models by age group. Here, error bars represent standard error. The solid line represents the trajectory of entrainment development as represented by data analyzed within the study. Although no data were collected for a late adolescence group, the dotted line represents a possible continued trajectory for entrainment development across this time period.
Figure 5.
Predictive accuracy of entrainment models by acoustic feature set. Full represents models containing all acoustic feature sets. EMS = envelope modulation spectrum; LTAS = long-term average spectrum; MFCC = mel-frequency cepstrum coefficient; VR = voice report.
Figure 6.
Predictive accuracy of entrainment models by entrainment type. Full represents models containing both entrainment types.
Figure 7.
Comparison of actual conversational quality and conversational efficiency scores for each participant and model predicted scores based on entrainment data.
Figure 8.
Predictive accuracy for conversational success by acoustic feature set. Full represents models containing all acoustic feature sets. EMS = envelope modulation spectrum; LTAS = long-term average spectrum; MFCC = mel-frequency cepstrum coefficient; VR = voice report.
Figure 9.
Predictive accuracy for conversational success by entrainment type. Full represents models containing both entrainment types.
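
The two entrainment types described in the Figure 3 caption, and the score count given in the Figure 2 caption, can be illustrated with a minimal Python sketch. The scoring formulas here are assumptions made for illustration only: this page does not state how the authors computed proximity or synchrony, so the functions `proximity` and `synchrony`, the negative-absolute-difference scoring, and the toy pitch values are all hypothetical. Only the counts (429 features, two entrainment types, 858 scores per turn) come from the source.

```python
from typing import List, Optional

# Counts taken from the Figure 2 caption.
N_FEATURES = 429          # acoustic features extracted per speaking turn
N_ENTRAINMENT_TYPES = 2   # proximity and synchrony
SCORES_PER_TURN = N_FEATURES * N_ENTRAINMENT_TYPES  # 858


def proximity(values: List[float], t: int) -> Optional[float]:
    """Hypothetical proximity score for turn t.

    Assumes turns alternate between two speakers, so turn t-1 belongs to
    the partner. The score is the negative absolute difference between the
    two adjacent turns, so values closer to 0 mean more similar features.
    """
    if t < 1:
        return None  # no preceding partner turn to compare against
    return -abs(values[t] - values[t - 1])


def synchrony(values: List[float], t: int) -> Optional[float]:
    """Hypothetical synchrony score for turn t.

    Compares how much each speaker's feature moved since that speaker's
    own previous turn. Values closer to 0 mean the two speakers moved in
    the same direction by a similar amount.
    """
    if t < 3:
        return None  # each speaker needs two turns to define a change
    own_change = values[t] - values[t - 2]
    partner_change = values[t - 1] - values[t - 3]
    return -abs(own_change - partner_change)


# Toy example (invented values): one feature, e.g. mean pitch in Hz,
# over five alternating turns A, B, A, B, A.
pitch = [200.0, 180.0, 195.0, 176.0, 191.0]
prox_last = proximity(pitch, 4)   # compares turn 4 with partner turn 3
sync_last = synchrony(pitch, 4)   # compares change 2->4 with change 1->3
```

In this toy series the last two turns differ by 15 Hz, so they are not especially close in value, yet both speakers dropped by 4 Hz since their own previous turns, so their movement matches exactly. That contrast is the distinction the Figure 3 caption draws between similarity of features and similarity in the movement of features.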
