Demographic Biases in Naturalistic Language Recordings in the CHILDES Database
- PMID: 40135546
- DOI: 10.1111/desc.70011
Abstract
In recent years, the importance of estimating demographic biases in research has become apparent. Here, we provide a systematic review of the CHILDES database, the major source of naturalistic recordings of children's linguistic environment. We analyzed the database according to four dimensions considered central to language learning: SES, urbanization, family structure, and language. We present descriptive statistics of each dimension to assess whether naturalistic recordings were biased regarding the demographics of the countries and the families recorded within them. We find that CHILDES's recordings overrepresented wealthier countries and higher parental education levels, urban settings, and smaller households. Middle- and higher-class participants were likewise overrepresented. The corpora were not representative of their countries in terms of urbanization either, with a larger percentage of families residing in urban settings than is true overall for their respective countries. In terms of family structure, nuclear families were more prevalent than in the countries where the data were collected. Last, we found that corpora were linguistically diverse, but we estimate that these recordings underrepresented bilingual and multilingual households. We conclude that researchers should be mindful when generalizing from naturalistic recordings of children's input and output obtained from CHILDES, and we make recommendations for the future use of CHILDES.
Keywords: CHILDES; Naturalistic recordings; Spontaneous speech; WEIRD; demographic biases; home recordings.
© 2025 John Wiley & Sons Ltd.
Similar articles
- childes-db: A flexible and reproducible interface to the child language data exchange system. Behav Res Methods. 2019 Aug;51(4):1928-1941. doi: 10.3758/s13428-018-1176-7. PMID: 30623390
- Efficient Estimation of Children's Language Exposure in Two Bilingual Communities. J Speech Lang Hear Res. 2021 Oct 4;64(10):3843-3866. doi: 10.1044/2021_JSLHR-20-00755. Epub 2021 Sep 14. PMID: 34520232 Free PMC article.
- HomeBank: An Online Repository of Daylong Child-Centered Audio Recordings. Semin Speech Lang. 2016 May;37(2):128-42. doi: 10.1055/s-0036-1580745. Epub 2016 Apr 25. PMID: 27111272 Free PMC article.
- Parental Input and Its Relationship With Language Outcomes in Children With (Suspected) Developmental Language Disorder: A Systematic Review. J Speech Lang Hear Res. 2025 Apr 8;68(4):1982-2005. doi: 10.1044/2024_JSLHR-24-00529. Epub 2025 Mar 12. PMID: 40073433
- Daylong egocentric recordings in small- and large-scale language communities: A practical introduction. Adv Child Dev Behav. 2024;66:29-53. doi: 10.1016/bs.acdb.2024.05.002. Epub 2024 Jun 6. PMID: 39074924 Review.
Cited by
- On Convenience, Diversity, and Generalisability: A Commentary on Scaff et al. (2025). Dev Sci. 2025 Sep;28(5):e70050. doi: 10.1111/desc.70050. PMID: 40676807 Free PMC article. No abstract available.
- Sustaining Language Acquisition Research in Africa: A Commentary on Scaff et al. (2025). Dev Sci. 2025 Sep;28(5):e70063. doi: 10.1111/desc.70063. PMID: 40827017 Free PMC article. No abstract available.