Sci Rep. 2021 Jan 12;11(1):776.

doi: 10.1038/s41598-020-80340-y.

Universal principles underlying segmental structures in parrot song and human speech

Dan C Mann¹,², W Tecumseh Fitch³, Hsiao-Wei Tu⁴, Marisa Hoeschele³,⁵

Affiliations

¹ Linguistics Program, The Graduate Center of the City University of New York, New York City, USA. daniel.mann@univie.ac.at.
² Department of Cognitive Biology, University of Vienna, Vienna, Austria. daniel.mann@univie.ac.at.
³ Department of Cognitive Biology, University of Vienna, Vienna, Austria.
⁴ Department of Psychology, University of Maryland, College Park, USA.
⁵ Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria.

PMID: 33436874
PMCID: PMC7804275
DOI: 10.1038/s41598-020-80340-y

Dan C Mann et al. Sci Rep. 2021.

Abstract

Despite the diversity of human languages, certain linguistic patterns are remarkably consistent across human populations. While syntactic universals receive more attention, there is stronger evidence for universal patterns in the inventory and organization of segments: units separated by rapid acoustic transitions that are used to build syllables, words, and phrases. Crucially, if an alien researcher investigated spoken human language the way we analyze non-human communication systems, many of the phonological regularities would be overlooked, because most analyses in non-humans treat breath groups, or "syllables" (units divided by silent inhalations), as the smallest unit. Here, we introduce a novel segment-based analysis that reveals patterns in the acoustic output of budgerigars, a vocal-learning parrot species, that match universal phonological patterns well documented in humans. We show that song in four independent budgerigar populations is composed of consonant- and vowel-like segments. Furthermore, the organization of segments within syllables is not random. As in spoken human language, segments at the start of a vocalization are more likely to be consonant-like, and segments at the end are more likely to be longer, quieter, and lower in fundamental frequency. These results provide a new foundation for empirical investigation of language-like abilities in other species.


Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Complex vocalizations of the human and the budgerigar. In human language, silence is not a reliable cue for word, syllable, or segment boundaries. For instance, the complex and novel phrase “the zealous sailors sail all seven seas and all four oceans” can be spoken without any intervening silence, as shown in the spectrogram (A). Budgerigar song seems to share this property. Budgerigar song is composed of complex (B: syllables 3 and 5) and simple (B: syllables 1, 2, 4, 6, 7) syllable types. The complex syllables (C), however, seem to be composed of segments that are similar to the simple syllables. We created the image using R and the packages *cowplot*, *ggplot2*, *ggpubr*, *seewave*, and *viridis*.

**Figure 2**
Segmentation and segment validation. (A) An example of the output of our segmentation algorithm on human speech, in this case an English vocal breath group, “four oceans”. (B) An example of the segmentation algorithm applied to budgerigar song. We scaled the algorithm to the different species by making it dependent on a minimum fundamental frequency setting. For each cluster size of human speech (C), silhouette widths were higher for segments than for syllables ($\bar{x}$ = 0.42 versus 0.25). For the budgerigar units, we included simple syllables and random snippets of syllables that were the same length as the segments as controls (D). Silhouette width scores for segments were statistically different from those for complex syllables ($\bar{x}$ = 0.32 versus 0.13, *V* = 105, *p* < 0.001) and for the random snippets ($\bar{x}$ = 0.32 versus 0.22, *V* = 105, *p* < 0.001). We created the image using Praat, R, and the R packages *ggplot2* and *magick*.
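
The two statistics named in this caption can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline (they worked in R and Praat). The function names `mean_silhouette` and `wilcoxon_v`, the one-dimensional toy points, and the synthetic score lists are all assumptions made for illustration. The second part assumes the comparison was paired across 14 individuals, the n reported in Figure 4. Under that assumption, *V* = 105 is the largest value the signed-rank statistic can take, since 14 × 15 / 2 = 105, which would mean every individual's segment score exceeded the control score.

```python
def mean_silhouette(points, labels):
    """Mean silhouette width for 1-D points with cluster labels (toy version)."""
    widths = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster
        same = [abs(p - q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(abs(p - q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


def wilcoxon_v(x, y):
    """Sum of the ranks of positive paired differences (the V statistic).

    Simplification: assumes no tied absolute differences; zeros are dropped.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranked = sorted(diffs, key=abs)
    return sum(rank for rank, d in enumerate(ranked, start=1) if d > 0)


# Toy data: two tight groups give a high silhouette width when labelled well.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
good = mean_silhouette(points, [0, 0, 0, 1, 1, 1])
bad = mean_silhouette(points, [0, 1, 0, 1, 0, 1])

# Synthetic per-individual scores (NOT the paper's data): 14 pairs in which
# the first score is always the larger one.
segment_scores = [0.30 + 0.010 * i for i in range(14)]
control_scores = [0.10 + 0.005 * i for i in range(14)]
v = wilcoxon_v(segment_scores, control_scores)
```

With these toy inputs, `good` is much larger than `bad`, and `v` equals the maximum possible value for 14 pairs.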

**Figure 3**
Budgerigar vowels and consonants. (A) A hierarchical clustering of budgerigar segments reveals two clear clusters of segments. (B) Using the clustering, we found that one cluster is composed of periodic, vowel-like units while the other is made up of aperiodic, consonant-like units. The former is, on average, longer, louder, and more periodic than the latter. We created the image using R and the packages *factoextra* and *ggplot2*.
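
As a rough illustration of how hierarchical clustering can separate segments into two groups, the sketch below runs average-linkage agglomerative clustering on a handful of invented feature vectors. The caption does not state the linkage method, the distance measure, or the features used, so the choice of average linkage, Euclidean distance, and a (duration, periodicity) feature pair are all assumptions, as are the function name `average_linkage` and the toy values.

```python
import math


def average_linkage(points, n_clusters=2):
    """Merge the closest pair of clusters until n_clusters remain.

    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(c1, c2):
        # Average pairwise Euclidean distance between the two clusters
        total = sum(math.dist(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters)))
        a, b = min(pairs,
                   key=lambda ab: cluster_distance(clusters[ab[0]],
                                                   clusters[ab[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical, already-scaled features: (relative duration, periodicity).
# Indices 0-2 mimic long, periodic units; 3-5 mimic short, aperiodic units.
toy_segments = [
    (1.0, 0.9), (1.1, 0.8), (0.9, 1.0),
    (0.2, 0.1), (0.1, 0.2), (0.3, 0.0),
]
two_clusters = average_linkage(toy_segments, n_clusters=2)
```

On real measurements, features on different scales (milliseconds, decibels, proportions) would normally be standardized before distances are computed.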

**Figure 4**
Edge effects in budgerigar song. (A) Canonical syllable with aperiodic burst onset, long and low final segment, and a rise-and-fall intensity contour. All groups shared similar positional biases. For all individuals, mean F0 was lower for segments in syllable-final position than for medial segments (mean of individual means: n = 14; Medial: $\bar{x}$ = 2522 Hz, s = 181 Hz; Final: 2069 ± 165 Hz); final segments were, on average, longer than medial segments (n = 14; Medial: $\bar{x}$ = 6.5 ms, s = 0.84 ms; Final: 12 ± 4.84 ms); intensity was lowest in initial position (Initial: $\bar{x}$ = 46.7 dB, s = 3.71 dB; Medial: 56.4 ± 4.03 dB; Final: 52.8 ± 4.46 dB); and periodicity was lowest in initial position, with slightly more than a third of segments having periodic vibration (n = 14; Initial: $\bar{x}$ = 34.3%, s = 10.5%; Medial: 74% ± 4.9%; Final: 54.9% ± 10.5%). For all four acoustic measurements, a mixed-effects model with position as a fixed effect performed better than a model without position (F0: χ² = 2353.4, df = 2, *p* < 0.001; duration: χ² = 3209.6, df = 2, *p* < 0.001; intensity: χ² = 22,186, df = 2, *p* < 0.001; periodicity: χ² = 10,158, df = 2, *p* < 0.001). (B) The prevalence of aperiodicity in syllable-initial positions and periodicity in syllable-medial positions suggests that budgerigars share the human segment organizational preference (C) for aperiodic onsets followed by periodic signals. We created the figure using R and the packages *cowplot*, *ggpubr*, *seewave*, and *viridis*.
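
The model comparisons in this caption read like likelihood-ratio tests, in which twice the log-likelihood gain of the fuller model is referred to a χ² distribution. The df = 2 is consistent with position having three levels (initial, medial, final), which adds two parameters. The sketch below assumes that reading; the caption does not name the test. For 2 degrees of freedom the χ² upper-tail probability has the closed form exp(−x/2), so no statistics library is needed. The function names are illustrative, and only the four χ² values come from the caption.

```python
import math


def lrt_statistic(loglik_reduced, loglik_full):
    """Likelihood-ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)


def log10_p_df2(x):
    """log10 of the same p-value, usable when exp() underflows to 0.0."""
    return -x / (2.0 * math.log(10.0))


# Chi-square values as reported in the Figure 4 caption (df = 2 for each).
reported = {
    "F0": 2353.4,
    "duration": 3209.6,
    "intensity": 22186.0,
    "periodicity": 10158.0,
}

p_values = {name: chi2_sf_df2(x) for name, x in reported.items()}
log10_p = {name: log10_p_df2(x) for name, x in reported.items()}
```

All four p-values underflow to 0.0 in double precision, which agrees with the reported *p* < 0.001; `log10_p` keeps their orders of magnitude for comparison.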


Cited by

Beat-based dancing to music has evolutionary foundations in advanced vocal learning.
Patel AD. BMC Neurosci. 2024 Nov 6;25(1):65. doi: 10.1186/s12868-024-00843-6. PMID: 39506663. Free PMC article. Review.
Voice modulatory cues to structure across languages and species.
Matzinger T, Fitch WT. Philos Trans R Soc Lond B Biol Sci. 2021 Dec 20;376(1840):20200393. doi: 10.1098/rstb.2020.0393. Epub 2021 Nov 1. PMID: 34719253. Free PMC article. Review.
Detecting surface changes in a familiar tune: exploring pitch, tempo and timbre.
Crespo-Bojorque P, Celma-Miralles A, Toro JM. Anim Cogn. 2022 Aug;25(4):951-960. doi: 10.1007/s10071-022-01604-w. Epub 2022 Feb 9. PMID: 35138480. Free PMC article.
Convergent vocal representations in parrot and human forebrain motor networks.
Yang Z, Long MA. Nature. 2025 Apr;640(8058):427-434. doi: 10.1038/s41586-025-08695-8. Epub 2025 Mar 19. PMID: 40108457.
Lessons learned in animal acoustic cognition through comparisons with humans.
Hoeschele M, Wagner B, Mann DC. Anim Cogn. 2023 Jan;26(1):97-116. doi: 10.1007/s10071-022-01735-0. Epub 2022 Dec 27. PMID: 36574158. Free PMC article. Review.


