Segmentability Differences Between Child-Directed and Adult-Directed Speech: A Systematic Test With an Ecologically Valid Corpus

Alejandrina Cristia¹, Emmanuel Dupoux¹, Nan Bernstein Ratner², Melanie Soderstrom³

Affiliations

¹ Dept d'Etudes Cognitives, ENS, PSL University, EHESS, CNRS.
² Department of Hearing and Speech Sciences, University of Maryland.
³ Department of Psychology, University of Manitoba.

PMID: 31149647
PMCID: PMC6515859
DOI: 10.1162/opmi_a_00022

Alejandrina Cristia et al. Open Mind (Camb). 2019 Feb 1;3:13-22. doi: 10.1162/opmi_a_00022.

Abstract

Previous computational modeling suggests that it is much easier to segment words from child-directed speech (CDS) than from adult-directed speech (ADS). However, this conclusion rests on data collected in the laboratory, with CDS drawn from play sessions and ADS from exchanges between a parent and an experimenter, which may not be representative of ecologically collected CDS and ADS. Here, fully naturalistic ADS and CDS, collected with a nonintrusive recording device as the child went about her day, were analyzed with a diverse set of algorithms. The difference between registers was small compared to differences between algorithms; it shrank when corpora were matched, and it even reversed under some conditions. These results highlight the value of studying learnability using naturalistic corpora and diverse algorithmic definitions.

Keywords: computational modeling; infant word segmentation; learnability; lexicon; statistical learning.

Conflict of interest statement

Competing Interests: The authors declare no competing interests.

Figures

Figure 1.
Token F-score (in percentage) achieved by each algorithm in child-directed speech (CDS) as a function of that in adult-directed speech (ADS) in the full Winnipeg corpus with human-set utterance boundaries. Error bars indicate two standard deviations (over 10 resamples; see main text and Supplemental Materials, Cristia, for details).
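
To make the metric in Figure 1 concrete, the sketch below computes a token F-score for a toy segmentation. It is an illustration only, not the authors' evaluation code: the function names `token_spans` and `token_f_score` are hypothetical, and it assumes the usual definition in which a predicted token counts as correct only when both of its edges match a gold token in the same utterance. Characters stand in for the phonological units a real corpus would use.

```python
from typing import List, Set, Tuple


def token_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Return the (start, end) character span of each token in one utterance."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def token_f_score(gold: List[List[str]], predicted: List[List[str]]) -> float:
    """Token F-score in percent over parallel lists of segmented utterances.

    Assumption: a predicted token is correct only if a gold token covers
    exactly the same span of the same utterance.
    """
    n_correct = n_gold = n_pred = 0
    for gold_utt, pred_utt in zip(gold, predicted):
        # Both segmentations must cover the same unsegmented string.
        if "".join(gold_utt) != "".join(pred_utt):
            raise ValueError("gold and predicted utterances differ in content")
        gold_set = token_spans(gold_utt)
        pred_set = token_spans(pred_utt)
        n_correct += len(gold_set & pred_set)
        n_gold += len(gold_set)
        n_pred += len(pred_set)
    if n_correct == 0:
        return 0.0
    precision = n_correct / n_pred
    recall = n_correct / n_gold
    return 100 * 2 * precision * recall / (precision + recall)


# Toy example: the algorithm undersegments "the dog" as one token.
gold = [["the", "dog", "barks"]]
pred = [["thedog", "barks"]]
print(round(token_f_score(gold, pred), 1))  # 40.0
```

In this toy case one of the two predicted tokens is correct (precision 1/2) and one of the three gold tokens is recovered (recall 1/3), which gives an F-score of 40%. Plotting this score for CDS against the score for ADS, one point per algorithm, yields the kind of comparison the figure describes.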
