Review
Neuroscience. 2011 Dec 15:198:152-70.
doi: 10.1016/j.neuroscience.2011.09.069. Epub 2011 Oct 13.

A hypothesis for basal ganglia-dependent reinforcement learning in the songbird

M S Fee et al. Neuroscience. 2011.

Erratum in

  • Neuroscience. 2013 Dec 26;255:301

Abstract

Most of our motor skills are not innately programmed, but are learned by a combination of motor exploration and performance evaluation, suggesting that they proceed through a reinforcement learning (RL) mechanism. Songbirds have emerged as a model system to study how a complex behavioral sequence can be learned through an RL-like strategy. Interestingly, like motor sequence learning in mammals, song learning in birds requires a basal ganglia (BG)-thalamocortical loop, suggesting common neural mechanisms. Here, we outline a specific working hypothesis for how BG-forebrain circuits could utilize an internally computed reinforcement signal to direct song learning. Our model includes a number of general concepts borrowed from the mammalian BG literature, including a dopaminergic reward prediction error and dopamine-mediated plasticity at corticostriatal synapses. We also invoke a number of conceptual advances arising from recent observations in the songbird. Specifically, there is evidence for a specialized cortical circuit that adds trial-to-trial variability to stereotyped cortical motor programs, and a role for the BG in "biasing" this variability to improve behavioral performance. This BG-dependent "premotor bias" may in turn guide plasticity in downstream cortical synapses to consolidate recently learned song changes. Given the similarity between mammalian and songbird BG-thalamocortical circuits, our model for the role of the BG in this process may have broader relevance to mammalian BG function.

Figures

Figure 1. Song development and underlying brain circuitry
(A) Spectrogram of the song of an adult zebra finch ‘tutor’. Note the stereotyped repetition of the syllable sequence ‘abc’. (B) Spectrograms showing the gradual evolution of a juvenile bird's song from highly variable ‘babble’-like subsong (top, 40 days post hatch, dph), to the incorporation of moderate temporal structure in plastic song (60 dph), and finally to the crystallized song of young adulthood (90 dph). Note imitation of the tutor ‘abc’ syllable sequence. (C) Schematic of the avian song system. The avian pallium is related to mammalian cortex (Jarvis, 2004), and we refer to pallial structures as ‘cortical.’ The motor pathway (dotted lines) is formed by the projection from HVC to RA. A second input to RA comes from LMAN (lateral magnocellular nucleus of the anterior nidopallium). LMAN has been envisioned as a frontal-like cortical nucleus because of its anterior location in the avian pallium and because of its inputs from the BG-recipient thalamus (Jarvis, 2004). (D) LMAN, Area X and DLM constitute a cortical-basal ganglia-thalamocortical loop called the anterior forebrain pathway (AFP). Area X is homologous to the mammalian basal ganglia.
Figure 2. Vocal variability in juvenile birds requires LMAN, the cortical output of a BG-thalamocortical loop
(A) Spectrogram showing the highly variable song of a young juvenile zebra finch (age 45 dph). Syllable segments (horizontal bars) and sound amplitudes (bottom) are shown below. (B) Song of the same bird during pharmacological inactivation of LMAN. Note the highly stereotyped syllable and gap durations, and the stereotyped acoustic structure within syllables; image adapted from Goldberg and Fee (2011). (C) Song is generated by two interacting premotor pathways. Subsong is highly variable and primarily driven by the DLM→LMAN→RA pathway (left). Adult song is highly stereotyped and driven primarily by sequential activity from HVC, which also requires inputs from nucleus Uvaeformis (Uva) in the dorsal thalamus (right). Plastic song has both variable and stereotyped components and is driven jointly by LMAN and HVC (center). During learning, control of song is gradually transferred from the LMAN to the HVC pathway.
Figure 3. LMAN generates premotor bias in a novel song operant conditioning task
(A) Schematic of the conditional feedback protocol. (A,a) Spectrogram of the targeted syllable. (A,b) A measure of pitch is computed continuously (black curve). Whenever the pitch falls above a threshold (blue region), white noise is played to the bird. The threshold is positioned in the center of the pitch distribution of the targeted region (green curve). (A,c) Spectrogram of a syllable ‘hit’ with white noise. (B) The pitch time course within the targeted harmonic stack for 20 consecutive renditions at 3 time periods during learning: immediately after the feedback was turned on (0 hours) and 2 and 4 hours later. (C) Each dot represents the average pitch of one rendition of the targeted syllable during a day of learning. At the end of the day, TTX was infused into LMAN, resulting in an immediate ‘unlearning’ of the day's song changes (gray dots, pre-TTX; red dots, post-TTX). (D) Time series showing the average pitch of the targeted syllable with LMAN intact (gray dots) and following LMAN inactivations (red dots) during successive days of conditional feedback. Pitch threshold (blue shading) was regularly updated to continually enforce learning. Note that the pitch changes in LMAN-inactivated song are consistently one day behind the LMAN-intact song, suggesting that they are ‘consolidated’ in the motor pathway with a delay. (E) Correlation coefficients were computed between the magnitude of the LMAN-dependent pitch change and the magnitude of the ‘consolidation,’ computed as the difference between successive LMAN inactivations. Correlation coefficients are plotted as a function of time lag (days), indicating that the amount of consolidation in the motor pathway is strongly predicted by the amount of bias that was generated 1 day earlier. Error bars are 95% confidence intervals. (Images reproduced from Andalman and Fee, 2009.)
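The conditional feedback rule in panel A can be illustrated with a minimal Python sketch. It assumes that the "center of the pitch distribution" is taken as the median of recent renditions; the caption does not say how the center was computed. The function names and the pitch values are hypothetical and are not taken from Andalman and Fee (2009).

```python
import statistics


def feedback_threshold(recent_pitches):
    """Place the threshold at the center of the recent pitch distribution.

    Assumption: the center is the median. The caption only says 'center'.
    """
    return statistics.median(recent_pitches)


def plays_white_noise(pitch, threshold):
    """White noise is delivered whenever the measured pitch is above threshold."""
    return pitch > threshold


# Made-up pitch values (Hz) for five renditions of the targeted syllable.
pitches_hz = [600.0, 610.0, 620.0, 630.0, 640.0]
threshold = feedback_threshold(pitches_hz)
hits = [plays_white_noise(p, threshold) for p in pitches_hz]
```

Recomputing `feedback_threshold` on each day's renditions would correspond loosely to the regular threshold updates described in panel D, since the threshold follows the distribution as the bird shifts its pitch.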
Figure 4. Putative medium spiny neurons in Area X and their inputs from LMAN and HVC
(A) Raster plot showing the spike patterns of an LMAN neuron during singing, aligned to 50 consecutive motif renditions of a juvenile bird (67 dph); spectrogram of motif at top. Note the trial-to-trial variability in the timing of LMAN spiking, but also a slight tendency to burst at particular times. (B) Schematic of distinct axon terminal arborizations of the two corticostriatal projections to Area X. LMAN axon terminals are highly localized and topographically organized in Area X, while HVC axons terminate globally. (C) Raster plot showing the spike patterns of 7 Area X-projecting HVC neurons during singing (image adapted from Kozhevnikov and Fee, 2007). The activity of each neuron is shown for several (>6) successive motifs, spectrogram at top. Note that HVC(X) neurons exhibit sparse, highly reliable spiking that is time-locked to specific times of the motif. (D-F) Activity of putative medium spiny neurons (MSNs) in Area X during singing (images from Goldberg and Fee, 2010). (D) The voltage trace of a putative MSN and its instantaneous firing rate are plotted beneath the spectrogram (age 64 dph). Note that this neuron spikes only during syllable “a” of a 3-syllable motif. (E) Top, expanded view of the voltage and spectrogram from the 1st motif from D (indicated by red bar). Middle, raster plot showing spike patterns during 73 renditions of the motif. Bottom, rate histogram compiled from the raster plot. (F) Spectrogram and raster plot of 6 putative MSNs recorded in one bird (61–65 dph). Each neuron exhibits sparse activity temporally localized to distinct parts of a 3-syllable motif.
Figure 5. A model of premotor bias generated by reward-modulated plasticity at corticostriatal synapses
(A) A schematic of the model for a 5 time-step ‘song.’ Five MSNs from a localized region of Area X are shown. Each MSN receives three inputs: (1) convergent input from a local subset of LMAN neurons, represented by one LMAN neuron in the diagram; (2) input from one of 5 HVC(X) neurons, each of which is active at one moment of a ‘song’; (3) a time-dependent global reward signal from VTA. Each MSN feeds back to activate the same LMAN neuron through pallidothalamic circuitry. Note that in the schematic, many HVC neurons project to this localized region of Area X, due to the divergence in the HVC→Area X projection. (B) Top, a chain of HVC(X) neurons discharges sequentially through each of the 5 moments of the song. LMAN can be biased to discharge at times 2 and 4 if the HVC(X) neurons active at those time points can activate MSNs at times 2 and 4. (C) Schematic of the three inputs to an MSN: VTA, HVC and LMAN. Before learning, the HVC-MSN synapse is weak. (D) Schematic of an ‘empiric synapse’ learning rule (Fiete et al., 2007). If an LMAN neuron bursts at time T, it produces an eligibility trace that ‘tags’ that synapse (Sutton and Barto, 1998; Redondo and Morris, 2011). If this LMAN activity results in a better-than-expected outcome, it is followed by a positive reward signal from VTA. (E) A consistent correlation between reward and eligibility trace strengthens the HVC-to-MSN synapse for this MSN.
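The learning rule in panels C to E can be illustrated with a toy simulation. This is a rough sketch under stated assumptions, not the authors' model. It assumes a sigmoid mapping from synaptic weight to LMAN burst probability, a binary eligibility flag with no decay, a single end-of-song reward rather than a time-dependent one, and a running average as the reward expectation. The function name `train_bias` and all parameter values are hypothetical.

```python
import math
import random


def train_bias(n_trials=5000, n_steps=5, good_steps=(1, 3), eta=0.1, seed=0):
    """Toy reward-modulated plasticity at HVC(X)->MSN synapses.

    good_steps uses zero-based indices, so (1, 3) stands for times 2 and 4.
    """
    rng = random.Random(seed)
    w = [0.0] * n_steps  # one HVC(X)->MSN weight per time step, weak at first
    r_bar = 0.5          # expected reward; 0.5 is chance level when all w == 0
    for _ in range(n_trials):
        # Assumed: the weight at step t biases the LMAN burst probability there.
        p = [1.0 / (1.0 + math.exp(-w[t])) for t in range(n_steps)]
        burst = [rng.random() < p[t] for t in range(n_steps)]
        # Assumed performance measure: fraction of steps where bursting
        # (or not bursting) matched the pattern that improves the song.
        matches = sum(burst[t] == (t in good_steps) for t in range(n_steps))
        reward = matches / n_steps
        rpe = reward - r_bar  # better or worse than expected
        for t in range(n_steps):
            if burst[t]:  # an LMAN burst tagged this synapse as eligible
                w[t] += eta * rpe
        r_bar += 0.05 * rpe  # slowly track the expected reward
    return w


weights = train_bias()
```

Because each weight changes only on trials where its synapse was tagged, a weight grows when bursting at that step is consistently followed by better-than-expected reward and shrinks otherwise. In this sketch the weights for times 2 and 4 end up positive and the others negative, which corresponds to the bias pattern described in panel B.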

References

    1. Alexander GE, Crutcher MD. Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci. 1990;13:266–271.
    2. Alexander GE, DeLong MR. Microstimulation of the primate neostriatum. I. Physiological properties of striatal microexcitable zones. J Neurophysiol. 1985;53:1401–1416.
    3. Alexander GE, DeLong MR, Strick PL. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu Rev Neurosci. 1986;9:357–381.
    4. Alvarez-Buylla A, Theelen M, Nottebohm F. Birth of projection neurons in the higher vocal center of the canary forebrain before, during, and after song learning. PNAS. 1988;85:8722–8726.
    5. Andalman AS, Fee MS. A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors. PNAS. 2009;106:12518–12523.
