Open Mind (Camb). 2019 Feb 1:3:13-22.
doi: 10.1162/opmi_a_00022.

Segmentability Differences Between Child-Directed and Adult-Directed Speech: A Systematic Test With an Ecologically Valid Corpus

Alejandrina Cristia et al. Open Mind (Camb).

Abstract

Previous computational modeling suggests it is much easier to segment words from child-directed speech (CDS) than adult-directed speech (ADS). However, this conclusion is based on data collected in the laboratory, with CDS from play sessions and ADS between a parent and an experimenter, which may not be representative of ecologically collected CDS and ADS. Fully naturalistic ADS and CDS collected with a nonintrusive recording device as the child went about her day were analyzed with a diverse set of algorithms. The difference between registers was small compared to differences between algorithms; it reduced when corpora were matched, and it even reversed under some conditions. These results highlight the interest of studying learnability using naturalistic corpora and diverse algorithmic definitions.

Keywords: computational modeling; infant word segmentation; learnability; lexicon; statistical learning.

Conflict of interest statement

Competing Interests: None of the authors declare any competing interests.

Figures

Figure 1.
Token F-score (in percentage) achieved by each algorithm in child-directed speech (CDS) as a function of that in adult-directed speech (ADS) in the full Winnipeg corpus with human-set utterance boundaries. Error bars indicate two standard deviations (over 10 resamples; see main text and Supplemental Materials, Cristia, , for details).
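
The token F-score named in the caption can be illustrated with a minimal sketch. This uses the common definition (a predicted word token counts as correct only if both of its edges match the gold segmentation) and is not necessarily the exact implementation used in the paper; the function names and the toy utterances are hypothetical.

```python
def token_spans(words):
    """Return the set of (start, end) character spans for a list of words."""
    spans, pos = set(), 0
    for w in words:
        spans.add((pos, pos + len(w)))
        pos += len(w)
    return spans


def token_f_score(gold_utts, pred_utts):
    """Token precision, recall, and F-score (in percentage) over utterances.

    Each utterance is a string with spaces marking word boundaries.
    Gold and predicted utterances are assumed to contain the same
    characters once spaces are removed.
    """
    n_correct = n_gold = n_pred = 0
    for gold, pred in zip(gold_utts, pred_utts):
        g = token_spans(gold.split())
        p = token_spans(pred.split())
        n_correct += len(g & p)  # tokens with both edges correct
        n_gold += len(g)
        n_pred += len(p)
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return 100 * precision, 100 * recall, 100 * f_score


# Toy example (made-up data): one undersegmented and one oversegmented utterance.
gold = ["the dog", "a kitty"]
pred = ["thedog", "a kit ty"]
print(token_f_score(gold, pred))  # (25.0, 25.0, 25.0)
```

Only the token "a" is recovered with both edges correct, out of four gold tokens and four predicted tokens, so precision, recall, and F-score are all 25%.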
