Everyday language input and production in 1,001 children from six continents

Elika Bergelson¹, Melanie Soderstrom², Iris-Corinna Schwarz³ ⁴, Caroline F Rowland⁵ ⁶ ⁷, Nairán Ramírez-Esparza⁸, Lisa R Hamrick⁹, Ellen Marklund³, Marina Kalashnikova¹⁰ ¹¹, Ava Guez¹², Marisa Casillas⁵ ⁷ ¹³, Lucia Benetti¹⁴, Petra van Alphen¹⁵, Alejandrina Cristia¹²

Affiliations

¹ Department of Psychology, Harvard University, Cambridge, MA 02138.
² Department of Psychology, University of Manitoba, Winnipeg, MB R3T 2N2, Canada.
³ Department of Linguistics, Stockholm University, Stockholm SE-106 91, Sweden.
⁴ Department of Special Education, Stockholm University, Stockholm SE-106 91, Sweden.
⁵ Language Development Department, Max Planck Institute for Psycholinguistics, Nijmegen 6525 XD, Netherlands.
⁶ Donders Centre for Brain, Cognition and Behaviour, Radboud University, Nijmegen 6525 XZ, Netherlands.
⁷ Australian Research Council Centre of Excellence for the Dynamics of Language, Australian National University, ACT 2601, Australia.
⁸ Psychological Sciences, University of Connecticut, Storrs, CT 06268.
⁹ Department of Psychological Sciences, Purdue University, West Lafayette, IN 47907.
¹⁰ Basque Center on Cognition Brain and Language, Donostia-San Sebastian 20009, Spain.
¹¹ Ikerbasque - Basque Foundation of Science, Bilbao 48009, Spain.
¹² Départment d'études Cognitives, École normale supérieure, École des hautes études en sciences sociales, Centre National de la Recherche Scientifique, PSL University, Laboratoire de Sciences Cognitives et Psycholinguistique, Paris 75005, France.
¹³ Comparative Human Development Department, University of Chicago, Chicago, IL 60637.
¹⁴ School of Music, Ohio State University, Columbus, OH 43210.
¹⁵ Royal Dutch Kentalis Utrecht, Utrecht 3527 JP, Netherlands.

PMID: 38085754
PMCID: PMC10756310
DOI: 10.1073/pnas.2300671120

Elika Bergelson et al. Proc Natl Acad Sci U S A. 2023 Dec 26;120(52):e2300671120. doi: 10.1073/pnas.2300671120. Epub 2023 Dec 12.

Abstract

Language is a universal human ability, acquired readily by young children, who otherwise struggle with many basics of survival. And yet, language ability is variable across individuals. Naturalistic and experimental observations suggest that children's linguistic skills vary with factors like socioeconomic status and children's gender. But which factors really influence children's day-to-day language use? Here, we leverage speech technology in a big-data approach to report on a unique cross-cultural and diverse data set: >2,500 d-long, child-centered audio-recordings of 1,001 2- to 48-mo-olds from 12 countries spanning six continents across urban, farmer-forager, and subsistence-farming contexts. As expected, age and language-relevant clinical risks and diagnoses predicted how much speech (and speech-like vocalization) children produced. Critically, so too did adult talk in children's environments: Children who heard more talk from adults produced more speech. In contrast to previous conclusions based on more limited sampling methods and a different set of language proxies, socioeconomic status (operationalized as maternal education) was not significantly associated with children's productions over the first 4 y of life, and neither were gender or multilingualism. These findings from large-scale naturalistic data advance our understanding of which factors are robust predictors of variability in the speech behaviors of young learners in a wide range of everyday contexts.

Keywords: human diversity; infancy; language; socioeconomic status; speech.

Conflict of interest statement

Competing interests statement: The authors declare no competing interest.

Figures

**Fig. 1.**
Geographical location, primary language, number of children ($N_{CHILD}$), number of recordings ($N_{REC}$), and data citation for each corpus.

**Fig. 2.**
Effects of adult talk, child age, and normative development on children’s speech production. Points show each daylong recording; lines show linear regression with 95% CIs. Child speech is quantified as child linguistic vocalization rate; adult talk as adult vocalization count rate (AVCr). (A) Child speech by age, split by low/mid/high tertiles of adult talk. Lines depict significant adult talk $\times$ age interaction. Color-shape combinations show each unique corpus, numbered to preserve anonymity. (B) Child speech by age and normative status. Lines depict significant age $\times$ normative status interaction. (C) Proportion of vocal behavior classified as speech, cry, or vegetative, by age. The line type/color indicates monolingual and normative statuses. N.B. Monolingual normative CI (blue) falls fully within that for multilingual children (pink) for all three types of vocal behavior, highlighting these groups’ equivalent patterns.

**Fig. 3.**
Factors that do not predict child speech or adult talk. Points = individual recordings, jittered horizontally. Lines = linear fit with 95% confidence intervals. Error bars = 99% bootstrapped CIs of sample means. Child speech is quantified as child linguistic vocalization rate; adult talk as adult vocalization count rate (AVCr). (A and B) Null effects of child gender (A) and socioeconomic status (SES) (B) on child speech. (C) Null three-way effect of normative development $\times$ adult talk $\times$ age (N.B.: normative $\times$ age and adult talk $\times$ age are significant; see Fig. 2). (D) Null three-way effect of age $\times$ adult talk $\times$ monolingual status. (E and F) Null effects of child gender (E) and SES (F) on adult talk. (G and H) Null effects of normative development (G) and monolingual status (H) on adult talk.

**Fig. 4.**
Child speech as a function of SES within individual corpora. SES = maternal education levels as in Table 1. White lines = linear fit with 95% CIs in color, color = corpus. Black lines = 99% CIs of sample means bootstrapped separately from linear fit for each level of SES. These data (as well as our main models and further analyses in SI 3H/G) do not reveal an SES effect on child speech.

**Fig. 5.**
Sample demographics. Number of daylong recordings (*Top row*) and children (*Bottom row*) in the full dataset across demographic variables. For socioeconomic status (SES), <H.S. = less than high school degree, H.S. = high school degree, S.U. = some university, B.A. = bachelor’s degree, >B.A. = advanced degree. For child gender, F = female, M = male. For monolingual status (monoling.), Y = monolingual, N = not monolingual. For normative development (norm.), Y = normative, N = nonnormative.
