Establishing the reliability of metrics extracted from long-form recordings using LENA and the ACLEW pipeline

Alejandrina Cristia¹, Lucas Gautheron²,³, Zixing Zhang⁴, Björn Schuller⁵,⁶, Camila Scaff²,⁷, Caroline Rowland⁸, Okko Räsänen⁹, Loann Peurey², Marvin Lavechin², William Havard²,¹⁰, Caitlin M Fausey¹¹, Margaret Cychosz¹², Elika Bergelson¹³, Heather Anderson¹¹, Najla Al Futaisi⁶, Melanie Soderstrom¹⁴

Affiliations

¹ Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes cognitives, ENS, EHESS, CNRS, PSL University, Paris, France. alecristia@gmail.com.
² Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes cognitives, ENS, EHESS, CNRS, PSL University, Paris, France.
³ Interdisciplinary Centre for Science and Technology Studies (IZWT) Wuppertal, University of Wuppertal, Nordrhein-Westfalen, Germany.
⁴ School of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China.
⁵ Technische Universität München, Institute for Human-Machine Communication, Munich, Germany.
⁶ Imperial College London, GLAM - Group on Language, Audio, & Music, London, UK.
⁷ Human Ecology group, Institute of Evolutionary Medicine, University of Zurich, Zurich, Switzerland.
⁸ Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
⁹ Unit of Computing Sciences, Tampere University, Tampere, Finland.
¹⁰ LLL, Université d'Orléans, CNRS, Orléans, France.
¹¹ Psychology, University of Oregon, Eugene, OR, USA.
¹² Department of Hearing and Speech Sciences, University of Maryland at College Park, College Park, MD, USA.
¹³ Harvard University, Psychology, Cambridge, MA, USA.
¹⁴ University of Manitoba, Psychology, Winnipeg, MB, Canada.

PMID: 39304601
DOI: 10.3758/s13428-024-02493-2

Alejandrina Cristia et al. Behav Res Methods. 2024 Dec;56(8):8588-8607. doi: 10.3758/s13428-024-02493-2. Epub 2024 Sep 20.

Abstract

Long-form audio recordings are increasingly used to study individual variation, group differences, and many other topics in theoretical and applied fields of developmental science, particularly for the description of children's language input (typically speech from adults) and children's language output (ranging from babble to sentences). The proprietary LENA software has been available for over a decade, and with it, users have come to rely on derived metrics like adult word count (AWC) and child vocalization counts (CVC), which have also more recently been derived using an open-source alternative, the ACLEW pipeline. Yet, there is relatively little work assessing the reliability of long-form metrics in terms of the stability of individual differences across time. Filling this gap, we analyzed eight spoken-language datasets: four from North American English-learning infants, and one each from British English-, French-, American English-/Spanish-, and Quechua-/Spanish-learning infants. The audio data were analyzed using two types of processing software: LENA and the ACLEW open-source pipeline. When all corpora were included, we found relatively low to moderate reliability (across multiple recordings, the intraclass correlation coefficient attributed to child identity [Child ICC] was < 50% for most metrics). There were few differences between the two pipelines. Exploratory analyses suggested some differences as a function of child age and corpora. These findings suggest that, while reliability is likely sufficient for various group-level analyses, caution is needed when using either LENA or ACLEW tools to study individual variation. We also encourage improvement of extant tools, specifically targeting accurate measurement of individual variation.
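
To make the Child ICC idea concrete, the following is a minimal sketch of how an intraclass correlation attributed to child identity could be estimated from repeated recordings. It is an illustration only and is not the authors' analysis: it uses the classic one-way random-effects ANOVA estimator, ICC(1), for a balanced design, whereas the abstract does not specify the estimation procedure. The function name `child_icc` and the toy values are hypothetical.

```python
from statistics import mean


def child_icc(recordings):
    """Estimate ICC(1) for one metric (e.g. adult word count).

    `recordings` maps a child identifier to a list of metric values,
    one value per recording. This sketch assumes a balanced design:
    every child has the same number of recordings (k >= 2).
    """
    groups = list(recordings.values())
    n = len(groups)            # number of children
    k = len(groups[0])         # recordings per child
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need >= 2 children with the same k >= 2 recordings each")

    grand_mean = mean(v for g in groups for v in g)

    # Between-child mean square: how far child means sit from the grand mean.
    ms_between = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n - 1)

    # Within-child mean square: recording-to-recording variation per child.
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n * (k - 1))

    # Share of variance attributable to stable differences between children.
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


if __name__ == "__main__":
    # Hypothetical per-recording values for three children, two recordings each.
    toy = {
        "child_a": [100, 120],
        "child_b": [200, 180],
        "child_c": [150, 170],
    }
    print(f"Child ICC (toy data): {child_icc(toy):.2f}")
```

A value near 1 means children keep their rank order from one recording to the next; a value below 0.5, as reported for most metrics here, means that more than half of the variance in a metric is not attributable to stable differences between children. Real long-form datasets are typically unbalanced and include covariates such as age, so a mixed-effects model would usually be preferable to this simplified estimator.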

Keywords: Accuracy; Big data; Daylong recordings; Speech technology.
