Proc Natl Acad Sci U S A. 2009 Apr 21;106(16):6826-31.
doi: 10.1073/pnas.0901835106. Epub 2009 Apr 3.

Learning reward timing in cortex through reward dependent expression of synaptic plasticity

Jeffrey P Gavornik et al. Proc Natl Acad Sci U S A.

Abstract

The ability to represent time is an essential component of cognition, but its neural basis is unknown. Although timing has been studied extensively both behaviorally and electrophysiologically, a general theoretical framework describing the elementary neural mechanisms used by the brain to learn temporal representations is lacking. It is commonly believed that the underlying cellular mechanisms reside in higher-order cortical regions, but recent studies show sustained neural activity in primary sensory cortices that can represent the timing of expected reward. Here, we show that local cortical networks can learn temporal representations through a simple framework predicated on reward-dependent expression of synaptic plasticity. We assert that temporal representations are stored in the lateral synaptic connections between neurons and demonstrate that reward-modulated plasticity is sufficient to learn these representations. We implement our model numerically to explain reward-time learning in the primary visual cortex (V1), demonstrate experimental support, and suggest additional experimentally verifiable predictions.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Schematic illustration summarizing key features from experimental results. Plots show the firing frequency response of a right-eye (RE) dominant neuron to RE stimulation and a left-eye (LE) dominant neuron to LE stimulation. In the naïve animal, both LE and RE neurons respond (gray lines) only during the period of stimulation (shaded box). During training, LE and RE stimulations are paired with rewards delivered after a short (ST) or long (LT) delay period (dashed lines). After training, neuronal responses (black lines) persist until the reward times paired with each stimulus. See Fig. S1 for examples of real neural activity.
Fig. 2.
Representing time. (A) Network structure. Neurons in the recurrent layer are shown with a subset of the recurrent excitatory synaptic connections (curved black arrows) and labeled with their associated activity levels (Vi). Actual network connectivity is all-to-all. Each neuron also receives external input from the left or right eye. The reward signal (R) projects to all neurons in the recurrent layer. (B) Example neural activity profiles from a single rate-based neuron after stimulation (shaded box) when isolated (black line, no recurrent stimulation, decay rate set by τm) and when embedded in a network with small lateral synaptic weights (gray line, decay rate τdμ). In our model, encoded time is represented by the decay rate of neural activity in the recurrently connected layer.
Fig. 3.
RDE in a network of passive integrator neurons. In the naïve network (left column), monocular stimulation of either the left (LE, stimulation of units 1–20) or right eye (RE, stimulation of units 21–40) elicits a brief period of activity (V, with values indicated by colorbar) that decays rapidly after the end of stimulation (green bar). There is no activity at the time of reward (cyan lines) for either input pattern. In the trained network (right column), stimulus-evoked activity decays with a time constant associated with the appropriate reward time. Plotting V (normalized, Insets) for example neurons (unit number 5 for LE, 25 for RE) in naïve (blue) and trained networks (red) shows that training increases the effective decay time constant.
Fig. 4.
RDE with stochastic spiking neurons. Each subplot shows a raster plot for all neurons in the network over the course of a single stimulus-evoked response (Upper) and the resultant spike histogram (Lower) indicating the average firing frequency in Hz for the whole network. (A and B) The 2 monocular stimulus patterns elicit brief periods of activity during stimulation (gray bar) in responsive subpopulations of the naïve network that decay before the times of reward (dashed lines) for each input. (C and D) After training with RDE, evoked activity persists until the appropriate reward times.
Fig. 5.
Experimental results demonstrating that training increases spontaneous firing rates and evoked responses as predicted by the model. Training evokes a 40% increase in the spontaneous firing rate and a 74% increase in evoked response. Error bars show standard deviation. Differences between naïve and trained responses are statistically significant for both metrics (spontaneous P = 0.01, evoked P = 1.5 × 10⁻⁴).
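
The Fig. 2 caption says that encoded time is represented by the decay rate of activity in the recurrently connected layer, and that lateral weights change that rate. A minimal sketch of this idea follows. It assumes a generic linear rate unit, tau_m * dV/dt = -V + w * V, where w is a hypothetical lumped recurrent gain. This equation, the function names, and the parameter values are illustrative assumptions and are not taken from the paper. Under this assumption the effective decay time constant is tau_m / (1 - w), so stronger lateral weights give slower decay.

```python
import math


def effective_tau(tau_m, w):
    """Effective decay time constant of the assumed linear rate unit.

    tau_m: intrinsic (isolated-neuron) time constant in seconds.
    w: hypothetical lumped recurrent gain, required to satisfy 0 <= w < 1.
    """
    if not 0.0 <= w < 1.0:
        raise ValueError("w must satisfy 0 <= w < 1 for decaying activity")
    return tau_m / (1.0 - w)


def simulate_decay(tau_m, w, v0=1.0, dt=1e-4, t_end=1.0):
    """Forward-Euler integration of tau_m * dV/dt = -V + w * V after stimulus offset."""
    steps = int(round(t_end / dt))
    v = v0
    trace = [v]
    for _ in range(steps):
        v += dt * (-v + w * v) / tau_m
        trace.append(v)
    return trace


def time_to_fraction(trace, dt, fraction=1.0 / math.e):
    """First time at which activity falls to the given fraction of its initial value."""
    threshold = fraction * trace[0]
    for i, v in enumerate(trace):
        if v <= threshold:
            return i * dt
    return None  # activity never decayed that far within the simulated window


if __name__ == "__main__":
    tau_m = 0.02  # assumed 20 ms intrinsic time constant
    for w in (0.0, 0.9, 0.98):
        trace = simulate_decay(tau_m, w, t_end=3.0)
        print(w, effective_tau(tau_m, w), time_to_fraction(trace, 1e-4))
```

In this sketch an isolated unit (w = 0) decays with tau_m, while a gain close to 1 stretches the decay toward the second range. Learning a reward time would then amount to adjusting w until the decay spans the stimulus-to-reward interval. The learning rule itself is not modeled here.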
