Comparative Study
Behav Res Methods. 2020 Apr;52(2):641-653.
doi: 10.3758/s13428-019-01265-7.

Look who's talking: A comparison of automated and human-generated speaker tags in naturalistic day-long recordings

Federica Bulgarelli et al. Behav Res Methods. 2020 Apr.

Abstract

The LENA system has revolutionized research on language acquisition, providing both a wearable device to collect day-long recordings of children's environments, and a set of automated outputs that process, identify, and classify speech using proprietary algorithms. This output includes information about input sources (e.g., adult male, electronics). While this system has been tested across a variety of settings, here we delve deeper into validating the accuracy and reliability of LENA's automated diarization, i.e., tags of who is talking. Specifically, we compare LENA's output with a gold standard set of manually generated talker tags from a dataset of 88 day-long recordings, taken from 44 infants at 6 and 7 months, which includes 57,983 utterances. We compare accuracy across a range of classifications from the original LENA Technical Report, alongside a set of analyses examining classification accuracy by utterance type (e.g., declarative, singing). Consistent with previous validations, we find overall high agreement between the human and LENA-generated speaker tags for adult speech in particular, with poorer performance identifying child, overlap, noise, and electronic speech (accuracy range across all measures: 0-92%). We discuss several clear benefits of using this automated system alongside potential caveats based on the error patterns we observe, concluding with implications for research using LENA-generated speaker tags.

Keywords: LENA system; LENA system reliability; Talker variability.
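
The comparison described in the abstract (human-generated tags as the gold standard, LENA tags as the prediction) can be illustrated with a small recall computation. This is a minimal sketch under stated assumptions, not the authors' analysis code: the function name `recall_by_human_label`, the label strings, and the toy data are all hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def recall_by_human_label(pairs):
    """Return, for each human label, the proportion of its segments
    that received each LENA label.

    pairs: iterable of (human_label, lena_label) tuples, one per segment.
    """
    counts = defaultdict(Counter)
    for human, lena in pairs:
        counts[human][lena] += 1

    matrix = {}
    for human, lena_counts in counts.items():
        total = sum(lena_counts.values())
        # Normalise within each human category, so the proportions for
        # one human label sum to 1 and the matching cell is the recall.
        matrix[human] = {lena: n / total for lena, n in lena_counts.items()}
    return matrix


# Hypothetical toy data, invented for illustration only.
pairs = [
    ("adult", "adult"), ("adult", "adult"), ("adult", "adult"),
    ("adult", "overlap"),
    ("child", "child"), ("child", "adult"),
    ("electronic", "electronic"), ("electronic", "overlap"),
]

matrix = recall_by_human_label(pairs)
for human, row in matrix.items():
    print(human, "recall:", row.get(human, 0.0))
```

Normalising within each human-coded category mirrors the way the confusion matrices in the figures are described: each human category is treated as the reference, and the cell where the two labels match gives the recall for that category.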


Figures

Figure 1. Confusion matrix displaying recall for LENA-generated labels compared to human-generated labels. Each column comprises all of the instances labeled by human coders as belonging to that category. Each cell shows the proportion of segments in that human category that received a given LENA software tag, as well as the total number of segments in the cell. Darker colors represent a higher proportion of LENA software tags.
Figure 2. Classification accuracy distribution by utterance type across the four main categories: adult, child, electronic, and overlap. The box plot reflects the median of the means for each infant for each utterance type. Each point (jittered horizontally) represents one child; diamonds (unjittered) indicate outliers.
Figure 3. Confusion matrix displaying proportion correct (i.e., recall) for LENA-generated labels compared to human-generated labels. Each column comprises all of the instances labeled by human coders. Each cell shows the proportion of segments in that human category that received a given LENA system tag, as well as the total number of segments in the cell. Darker colors represent a higher proportion of LENA system tags.
Figure 4. Classification accuracy distribution by utterance type. Each point (jittered horizontally) represents one child; diamonds (unjittered) indicate outliers. N.B. not all participants contributed data to each utterance type for each comparison.
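
The per-child summaries behind the accuracy-by-utterance-type plots can be sketched as a two-step aggregation: a mean accuracy per child and utterance type, then a median across children. This is one reading of "the median of the means for each infant", offered as an assumption; the function name `median_of_child_means`, the child IDs, and the records below are hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict
from statistics import mean, median


def median_of_child_means(records):
    """Summarise accuracy per utterance type.

    records: iterable of (child_id, utterance_type, correct) tuples,
    where `correct` is True when the LENA tag matched the human tag.
    Returns {utterance_type: median over children of per-child mean accuracy}.
    """
    # Step 1: collect 0/1 outcomes per (utterance type, child).
    outcomes = defaultdict(list)
    for child, utype, correct in records:
        outcomes[(utype, child)].append(1.0 if correct else 0.0)

    # Step 2: one mean per child, grouped by utterance type.
    child_means = defaultdict(list)
    for (utype, _child), values in outcomes.items():
        child_means[utype].append(mean(values))

    # Step 3: median across the children who contributed that type.
    return {utype: median(means) for utype, means in child_means.items()}


# Hypothetical toy records, invented for illustration only.
records = [
    ("c01", "declarative", True), ("c01", "declarative", True),
    ("c01", "declarative", False), ("c01", "declarative", True),
    ("c02", "declarative", True), ("c02", "declarative", False),
    ("c03", "declarative", True),
    ("c01", "singing", False), ("c01", "singing", True),
    ("c02", "singing", False),  # c03 contributes no singing utterances
]

summary = median_of_child_means(records)
print(summary)
```

Because a child with no utterances of a given type simply contributes no mean for it, the sketch also reflects the note that not all participants contributed data to each utterance type.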

