Nat Commun. 2024 Apr 24;15(1):3419.
doi: 10.1038/s41467-024-47824-1.

Goal-directed and flexible modulation of syllable sequence within birdsong

Takuto Kawaji et al. Nat Commun. 2024.

Abstract

Songs constitute a complex system of vocal signals for inter-individual communication in songbirds. Here, we elucidate the flexibility that songbirds exhibit in organizing and sequencing syllables within their songs. Utilizing a newly devised song decoder for quasi-real-time annotation, we execute an operant conditioning paradigm, with rewards contingent upon specific syllable syntax. Our analysis reveals that birds possess the capacity to modify the contents of their songs, adjusting the repetition length of particular syllables and employing specific motifs. Notably, birds altered their syllable sequence in a goal-directed manner to obtain rewards. We demonstrate that such modulation occurs within a distinct song segment, with adjustments made within 10 minutes after cue presentation. Additionally, we identify the involvement of the parietal-basal ganglia pathway in orchestrating these flexible modulations of syllable sequences. Our findings unveil an unappreciated aspect of songbird communication, drawing parallels with human speech.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Development and evaluation of a quasi-real-time song decoder, SAIBS.
a The operational architecture. In the training phase, syllables were clustered and used to train a convolutional neural network (CNN). In the decoding phase, the trained CNN was used to decode incoming audio. t-SNE t-distributed stochastic neighbor embedding, DBSCAN density-based spatial clustering of applications with noise. b, c Example of syllables automatically clustered by SAIBS (b) and their detection in a song (c). d Annotation comparison against TweetyNet. Songs from one bird were annotated by SAIBS, and the results were compared with those from TweetyNet. The matrix shows the mean match rate between the two decoders for each syllable. e The concordance rate for each syllable is shown with rates of insert-type and deletion-type errors. Mean ± s.e.m. from n = 4 trials.
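The comparison in panels d and e can be illustrated with a small alignment sketch. The caption does not state how SAIBS and TweetyNet annotations were actually aligned or scored, so everything here is an assumption: songs are written as strings with one letter per syllable, the function name `compare_annotations` is hypothetical, and Python's `difflib.SequenceMatcher` stands in for whatever alignment the authors used.

```python
from difflib import SequenceMatcher


def compare_annotations(reference: str, decoded: str) -> dict:
    """Align two syllable strings and count agreement and error types.

    Assumption: 'insert' means the decoded annotation contains a syllable
    that the reference lacks; 'delete' means the reverse.
    """
    counts = {"match": 0, "insert": 0, "delete": 0, "replace": 0}
    # autojunk=False keeps the alignment exact for long, repetitive songs.
    matcher = SequenceMatcher(None, reference, decoded, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            counts["match"] += i2 - i1
        elif tag == "insert":
            counts["insert"] += j2 - j1
        elif tag == "delete":
            counts["delete"] += i2 - i1
        else:  # "replace": syllables present in both but labeled differently
            counts["replace"] += max(i2 - i1, j2 - j1)
    return counts


if __name__ == "__main__":
    result = compare_annotations("abccd", "abcd")
    # Concordance here is matched syllables over reference length.
    print(result, result["match"] / len("abccd"))
```

Under these assumptions, a concordance rate for one song is the matched count divided by the reference length, and per-syllable rates would follow by tallying the same opcodes separately for each syllable label.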
Fig. 2. Operant conditioning of syllable repetition.
a The repetition of the target syllables in songs was analyzed online by SAIBS. If the repetition number of the target syllable “c” exceeded the predetermined threshold “x”, a reward movie was presented on a monitor. b Example of song spectrograms before and after the conditioning (top). The daily shift of maximum repetition counts in nine birds (bottom). Mean ± s.e.m. c–e Change of syllable repetition with conditioning (c), without conditioning (d), and with conditioning using a non-social feedback movie (e). The box plot shows the median and first and third quartiles, with the mean shown as circles; P values, two-sided paired t-test, t(8) = −10.5, n = 9 (c), t(4) = −0.384, n = 5 (d), and t(4) = −1.15, n = 5 (e). f Shift in the song rate of songs with each repetition number before and after conditioning. Pink shading, the rewarded range. Mean ± s.e.m.; P values, two-sided paired t-test (left, middle), t(5) = 3.41, n = 6. The summary (left, middle) and results from each bird (right) are shown. The heatmap shows the song rate change (%) for each bird.
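The reward contingency in panel a can be sketched in a few lines. This is an illustration only, not the SAIBS implementation: it assumes a decoded song is available as a string with one letter per syllable, and the function names `max_repetition` and `should_reward` are hypothetical. "Exceeds" is read as strictly greater than the threshold.

```python
from itertools import groupby


def max_repetition(song: str, target: str) -> int:
    """Return the longest run of consecutive `target` syllables in `song`."""
    runs = [
        len(list(group))
        for syllable, group in groupby(song)
        if syllable == target
    ]
    return max(runs, default=0)


def should_reward(song: str, target: str, threshold: int) -> bool:
    """Reward only when the longest run strictly exceeds the threshold."""
    return max_repetition(song, target) > threshold


if __name__ == "__main__":
    song = "iiabcccdabccccce"  # hypothetical decoded song
    print(max_repetition(song, "c"))
    print(should_reward(song, "c", threshold=4))
```

In this hypothetical song the longest run of "c" is five syllables, so a threshold of 4 would trigger the reward and a threshold of 5 would not.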
Fig. 3. Song change occurs at a defined locus within the song.
a Markov diagrams of syllable transitions (bi-gram) before and after the conditioning of syllable repetition. b Tri-gram syllable transition matrix before (left) and after (middle) the conditioning, and the difference between the two (right). This subject was conditioned to increase repetition of syllable “c”. c Spectrograms of songs before and after conditioning. In this example, the repetition occurred at three loci in a single song bout. d Change of the repetition locus where the maximum repetition occurs in a song bout. The heatmap shows the song rate change (%) before and after conditioning for each bird.
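The bi-gram and tri-gram transition matrices in panels a and b can be estimated from annotated songs by counting. The sketch below is a generic n-gram estimate, not the authors' analysis code; the function name `transition_probabilities` and the one-letter-per-syllable string encoding are assumptions.

```python
from collections import Counter, defaultdict


def transition_probabilities(songs, n=2):
    """Estimate P(next syllable | previous n-1 syllables) from song strings.

    n=2 gives bi-gram transitions (the Markov diagram); n=3 gives tri-gram
    transitions keyed by the two preceding syllables.
    """
    counts = defaultdict(Counter)
    for song in songs:
        for i in range(len(song) - n + 1):
            context = song[i:i + n - 1]
            next_syllable = song[i + n - 1]
            counts[context][next_syllable] += 1
    # Normalize each context's counts into probabilities.
    return {
        context: {
            syllable: count / sum(nexts.values())
            for syllable, count in nexts.items()
        }
        for context, nexts in counts.items()
    }


if __name__ == "__main__":
    songs = ["abcc", "abcd"]  # hypothetical annotated songs
    print(transition_probabilities(songs, n=2))
    print(transition_probabilities(songs, n=3))
```

A before/after difference like the right-hand matrix in panel b would then be the element-wise subtraction of two such tables, treating missing transitions as probability zero.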
Fig. 4. Context-dependent modulation of song contents.
a Experimental scheme of devaluation training. “Reward” day and “Excess-reward” day were each repeated for 3 days. b The shift of repetition, t(4) = 3.18, n = 5 birds. c Experimental scheme. “Reward” day and “No-reward” day were repeated for 3 days. d The shift of repetition number, t(4) = 7.25, n = 5 birds. e The shift of repetition across days of “Reward” days (left, red) and “No-reward” days (right, blue) within the 3-day period. f Scheme of training and test sessions. Repetition training was conducted with a colored frame on the monitor. g, h The shift of repetition number in the test sessions when the rule was changed on a 1-day cycle (g) and a 1-h cycle (h), t(4) = −4.55 and −4.21, n = 5 birds. i Shift of the ratio of repetition number in the “Reward color” condition against the “No-reward color” condition in the 1-h cycle. The ratios for early (10:00–15:00) and late (17:00–22:00) sessions within each day are shown. Throughout the panels, the box plot shows the median and first and third quartiles, with the mean shown as a circle; P values, two-sided paired t-test; t(4) = 2.16, 1.71, −0.347 from day 1 to day 3; n = 5 birds.
Fig. 5. Ablation of the brain nucleus affects the flexible modulation of songs.
a Overview of the experiment. b An example of a histological section from a bird that received nucleus ablation with ibotenic acid. The signals from fluorescent immunostaining against Fox3 (magenta) are shown with DAPI (green). A diagram of the AFP pathway is overlaid on the right. Major innervations within the nucleus are shown with dotted arrows. HVC (letter-based name); RA robust nucleus of the arcopallium, DLM medial-lateral nucleus of the dorsal thalamus, LMAN lateral magnocellular nucleus of the anterior nidopallium. Scale bar, 1 mm. c Tri-gram syllable transition probability matrix before and after the ablation. d Syllable repetition before and after the ablation. The box plot shows the median and first and third quartiles, with the mean shown as a circle. P values, Tukey’s HSD-test, two-sided, t(599) = 9.31, t(599) = 12.5, t(599) = 0.259 from left to right, n = 6 birds.
Fig. 6. Syntactic organization affects the flexible modulation of song contents.
a Overview of the morphological analysis. b Perplexity results with different numbers of latticelm-segmented motifs compared with randomly segmented motifs. The perplexity in predicting the upcoming syllable for birds having a total of 14 syllables is shown. c Log-log plot of the rank-frequency distribution of motifs. The rank of motif usage in the song corpus and the occurrence of that motif are shown. The selected alternative targets are shown with green and purple lines (high similarity pair) and dots (low similarity pair). d Transition diagram and sonograms of bird #7, having “f-bca” (left) and “da-bca” (right) transitions. e Alternative motif usage conditioning in the convergent locus. The frequency of the utterance of the target motif is shown in rules A (purple) and B (green). f Percent change of frequency of the alternative motif “f-bca” over “da-bca”. g Percent change of frequency of syllables “f” and “da”, excluding those used as “f-bca” and “da-bca”. h, i Conditioning of a motif pair with low similarity. j–n Example of another bird that learned to use an alternative motif with a high similarity score in the divergent locus (j–l), but not with low similarity motifs (m, n). Throughout the panels, plots show mean ± s.e.m., n = 100 bouts; P values, Tukey HSD-test, two-sided; t(99) = 8.54 (f), t(99) = 1.77 (g), t(99) = 1.64 (i), t(99) = 6.53 (k), t(99) = 0.463 (l), t(99) = 1.83 (n). The dotted lines indicate the ratios before the conditioning that were used for normalization.
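Perplexity, the measure used in panel b, is the exponential of the average negative log-probability a model assigns to each upcoming syllable; lower values mean the sequence is more predictable. The sketch below computes it for a plain syllable-level bi-gram model with add-alpha smoothing. This is a deliberately simpler stand-in and not the latticelm motif segmentation used in the figure; the names `train_bigram` and `perplexity`, the smoothing constant, and the string encoding of songs are all assumptions.

```python
import math
from collections import Counter, defaultdict


def train_bigram(songs, vocabulary, alpha=1.0):
    """Return a function prob(prev, nxt) from smoothed bi-gram counts."""
    counts = defaultdict(Counter)
    for song in songs:
        for prev, nxt in zip(song, song[1:]):
            counts[prev][nxt] += 1
    vocab_size = len(vocabulary)

    def prob(prev, nxt):
        # Add-alpha smoothing keeps unseen transitions at nonzero probability.
        total = sum(counts[prev].values())
        return (counts[prev][nxt] + alpha) / (total + alpha * vocab_size)

    return prob


def perplexity(prob, songs):
    """exp of the mean negative log-probability over all transitions."""
    log_sum, n = 0.0, 0
    for song in songs:
        for prev, nxt in zip(song, song[1:]):
            log_sum += math.log(prob(prev, nxt))
            n += 1
    return math.exp(-log_sum / n)


if __name__ == "__main__":
    untrained = train_bigram([], vocabulary="abcd")
    trained = train_bigram(["abab"] * 50, vocabulary="ab")
    print(perplexity(untrained, ["abcd"]))  # uniform over 4 syllables
    print(perplexity(trained, ["abab"]))    # close to 1 for a regular song
```

With no training data the model is uniform, so its perplexity equals the vocabulary size; a model trained on a highly regular song approaches 1. A comparison in the spirit of panel b would evaluate the same held-out songs under models built from different segmentations.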
