Front Neural Circuits. 2012 Jun 27:6:38.
doi: 10.3389/fncir.2012.00038. eCollection 2012.

Oculomotor learning revisited: a model of reinforcement learning in the basal ganglia incorporating an efference copy of motor actions

Michale S Fee. Front Neural Circuits.

Abstract

In its simplest formulation, reinforcement learning is based on the idea that if an action taken in a particular context is followed by a favorable outcome, then, in the same context, the tendency to produce that action should be strengthened, or reinforced. While reinforcement learning forms the basis of many current theories of basal ganglia (BG) function, these models do not incorporate distinct computational roles for signals that convey context, and those that convey what action an animal takes. Recent experiments in the songbird suggest that vocal-related BG circuitry receives two functionally distinct excitatory inputs. One input is from a cortical region that carries context information about the current "time" in the motor sequence. The other is an efference copy of motor commands from a separate cortical brain region that generates vocal variability during learning. Based on these findings, I propose here a general model of vertebrate BG function that combines context information with a distinct motor efference copy signal. The signals are integrated by a learning rule in which efference copy inputs gate the potentiation of context inputs (but not efference copy inputs) onto medium spiny neurons in response to a rewarded action. The hypothesis is described in terms of a circuit that implements the learning of visually guided saccades. The model makes testable predictions about the anatomical and functional properties of hypothesized context and efference copy inputs to the striatum from both thalamic and cortical sources.

Keywords: context; corticostriatal; efference copy; motor learning; songbird; striatum; thalamostriatal.

Figures

Figure 1
The basal ganglia can drive learned changes in visually guided saccades. (A) Schematic diagram of the direct pathway of an oculomotor circuit in the BG. The output of the BG can be thought of as having discrete motor “channels.” Shown are two channels that project to the superior colliculus and can drive saccades to the left or right. In this simple model, these channels can be driven by sensory inputs from cortex, illustrated here by neurons responding to the appearance of visual targets 1 and 2. (B) Neurons in the substantia nigra pars reticulata (SNr) are tonically active and inhibit the generation of saccades by the superior colliculus. SNr neurons can be inhibited by spiking in medium spiny neurons (MSNs) in the striatum, thus releasing the superior colliculus from inhibition (adapted from Hikosaka et al., 2000). (C) Illustration of a stimulus-response task in which saccades in only one direction (e.g., leftward) are rewarded, while saccades in the other direction are not. (D) During training, saccades in the rewarded direction become faster and are generated with a shorter latency than saccades in the unrewarded direction. This behavioral change is thought to be mediated by activation of MSNs in the rewarded “channel” by the appropriate cortical inputs. Images in panels (C,D) are taken from Lauwereyns et al. (2002).
Figure 2
A model of vocal learning in the songbird. (A) Schematic diagram of nuclei involved in song production and song learning. (B) Hypothesized homology between songbird and mammalian brain areas (MC, motor cortex). (C) Firing patterns of a single corticostriatal neuron in LMAN during singing. Each row of the raster plot shows the spikes produced during a different rendition of the song (spectrogram shown at top). The high degree of variability in LMAN activity is thought to drive exploratory song variations during learning. (D) Firing patterns of seven different corticostriatal neurons in HVC during singing. The raster plot shows the spikes produced during 10 sequential song renditions for each neuron. Note the highly stereotyped and sparse burst pattern of each neuron. The spiking of MSNs shows a similar degree of sparseness, potentially allowing MSNs to compute the value of LMAN fluctuations independently at each time in the song. (E) A simple hypothesized circuit for song learning. LMAN and HVC inputs converge, together with dopaminergic inputs from the VTA, onto a single medium spiny neuron (MSN) in Area X. The LMAN input to the MSN (hollow circle) arises as an axon collateral of the projection of LMAN to the motor pathway (RA). This efference copy input does not drive spiking in the MSN, but gates synaptic plasticity at the HVC input (filled circle). If LMAN activity is coincident with the HVC input and leads to improved song performance (signaled by increased dopamine input), then the HVC-MSN synapse is strengthened. On future song renditions, the HVC input drives the MSN to spike, thus disinhibiting the thalamus and biasing the LMAN neuron to be more active at that time. (F) Schematic showing the closed topographical loops between LMAN, Area X, and the thalamic nucleus DLM (Luo et al., 2001). This allows Area X to independently evaluate and bias activity in each different subregion of LMAN. Also shown are the hypothesized divergent inputs from HVC.
Figure 3
A model of oculomotor learning in the BG incorporating an efference copy signal. (A) “Random” saccades during learning are generated in cortical frontal eye fields (dark yellow, FEF). Efference copy inputs to the MSN (hollow circle, analogous to LMAN inputs to Area X) arise from a collateral of the descending motor commands from the FEF to the superior colliculus (SC). Context inputs to the MSN (filled circles, analogous to HVC inputs to Area X) arise from cortical neurons conveying sensory inputs. The output of the SNr biases saccade generation by a projection to intermediate “motor” layers of SC. (B) The hypothesized learning rule that incorporates efference copy, context, and reward signals. Coincident activation of context (CX) and efference copy (EC) inputs activates a transient eligibility trace (Etrace). If a reward signal (Reward) coincides with the eligibility trace, then the CX input is strengthened (ΔW_CX-MSN > 0). (C) Hypothesized sequence of events during learning. (1) Cortical neuron Ctx-1 becomes active, indicating the appearance of a particular target (i.e., Target 1). (2) The FEF generates a “random guess” at a saccade direction, in this case, to the left. This combination activates an eligibility trace in the Ctx-1 to MSN-L synapse. (3) If leftward saccades are rewarded in response to Target 1, the monkey receives a reward, resulting in increased spiking in dopaminergic VTA neurons. (4) The coincidence of the reward and eligibility trace results in strengthening of the Ctx-1 to MSN-L synapse. Thus, future appearances of Target 1 will bias the monkey to make a leftward saccade.
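The learning rule in panels (B,C) can be illustrated with a short simulation. The following is a minimal sketch, not the author's implementation: the function name train, the learning rate, the task contingency (Target 1 rewards leftward saccades, Target 2 rewards rightward saccades), the purely random choice of saccade, and the saturating factor (1 - w) that keeps weights below 1 are all assumptions introduced for illustration. The only elements taken from the caption are the three factors themselves: a context input (CX), an efference copy input (EC) that gates plasticity without being potentiated, and a reward signal.

```python
import random


def train(n_trials=200, lr=0.1, seed=0):
    """Three-factor rule: dW(CX->MSN) is nonzero only when CX, EC and reward coincide."""
    rng = random.Random(seed)
    directions = ("L", "R")
    # w[target][direction]: strength of the Ctx-<target> synapse onto MSN-<direction>.
    w = {1: {"L": 0.0, "R": 0.0}, 2: {"L": 0.0, "R": 0.0}}
    # Hypothetical task contingency (an assumption, not from the source).
    rewarded = {1: "L", 2: "R"}

    for _ in range(n_trials):
        target = rng.choice([1, 2])        # (1) context neuron Ctx-<target> becomes active
        saccade = rng.choice(directions)   # (2) FEF makes a "random guess"
        reward = 1.0 if saccade == rewarded[target] else 0.0  # (3) dopamine signal

        for d in directions:
            cx = 1.0                            # context input is active on both MSNs
            ec = 1.0 if d == saccade else 0.0   # efference copy reaches only the chosen channel
            etrace = cx * ec                    # eligibility trace needs CX and EC together
            # (4) Only the context synapse is potentiated; EC merely gates the change.
            w[target][d] += lr * reward * etrace * (1.0 - w[target][d])
    return w


if __name__ == "__main__":
    weights = train()
    for target, row in weights.items():
        print(f"Ctx-{target}: MSN-L={row['L']:.3f}  MSN-R={row['R']:.3f}")
```

Under these assumptions, only the Ctx-1 to MSN-L and Ctx-2 to MSN-R weights grow, because those are the only synapses at which the eligibility trace and the reward ever coincide. The sketch leaves out the bias that the strengthened synapses would exert on later saccades, so exploration stays random throughout.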
Figure 4
Three additional models of BG circuits incorporating efference copy. (A) Efference copy comes from the FEF, as in Figure 3A, but the output of the BG acts to bias saccade generation in the FEF through the pallido-thalamo-cortical loop, rather than acting directly on the SC. (B) A model in which “random” saccades may be driven by any input to the SC, and efference copy signals to the BG arise from ascending tectothalamic and thalamostriatal pathways. In this model, the BG biases saccade generation by acting directly on the SC. (C) A model in which both efference copy inputs and context inputs arise from thalamostriatal pathways. This model is hypothesized to represent an evolutionarily early role for the BG in controlling brainstem-generated behavior.

References

    1. Abramson B. P., Chalupa L. M. (1988). Multiple pathways from the superior colliculus to the extrageniculate visual thalamus of the cat. J. Comp. Neurol. 271, 397–418. doi: 10.1002/cne.902710308
    2. Alexander G. E., Crutcher M. D. (1990). Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci. 13, 266–271.
    3. Alexander G. E., DeLong M. R., Strick P. L. (1986). Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9, 357–381. doi: 10.1146/annurev.ne.09.030186.002041
    4. Andalman A. S., Fee M. S. (2009). A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors. Proc. Natl. Acad. Sci. U.S.A. 106, 12518–12523. doi: 10.1073/pnas.0903214106
    5. Aronov D., Andalman A. S., Fee M. S. (2008). A specialized forebrain circuit for vocal babbling in the juvenile songbird. Science 320, 630–634. doi: 10.1126/science.1155140