PLoS One. 2012;7(4):e33612.
doi: 10.1371/journal.pone.0033612. Epub 2012 Apr 10.

Optogenetic mimicry of the transient activation of dopamine neurons by natural reward is sufficient for operant reinforcement


Kyung Man Kim et al. PLoS One. 2012.

Abstract

Activation of dopamine receptors in forebrain regions, for minutes or longer, is known to be sufficient for positive reinforcement of stimuli and actions. However, the firing rate of dopamine neurons is increased for only about 200 milliseconds following natural reward events that are better than expected, a response which has been described as a "reward prediction error" (RPE). Although RPE drives reinforcement learning (RL) in computational models, it has not been possible to directly test whether the transient dopamine signal actually drives RL. Here we have performed optical stimulation of genetically targeted ventral tegmental area (VTA) dopamine neurons expressing Channelrhodopsin-2 (ChR2) in mice. We mimicked the transient activation of dopamine neurons that occurs in response to natural reward by applying a light pulse of 200 ms in VTA. When a single light pulse followed each self-initiated nose poke, it was sufficient in itself to cause operant reinforcement. Furthermore, when optical stimulation was delivered in separate sessions according to a predetermined pattern, it increased locomotion and contralateral rotations, behaviors that are known to result from activation of dopamine neurons. All three of the optically induced operant and locomotor behaviors were tightly correlated with the number of VTA dopamine neurons that expressed ChR2, providing additional evidence that the behavioral responses were caused by activation of dopamine neurons. These results provide strong evidence that the transient activation of dopamine neurons provides a functional reward signal that drives learning, in support of RL theories of dopamine function.
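The role of an RPE in a computational RL model can be illustrated with a minimal sketch. This is a generic Rescorla-Wagner style update, not the model or analysis used in the paper; the function names, the learning rate, and the reward values are all assumptions chosen for illustration. The prediction error is the difference between the received and the expected reward, and the expectation is nudged toward the reward by a fraction of that error.

```python
def rpe_update(value, reward, learning_rate=0.1):
    """Return (new_value, delta) for one trial.

    delta is the reward prediction error: positive when the reward
    is better than expected, negative when it is worse.
    learning_rate is a hypothetical parameter, not taken from the paper.
    """
    delta = reward - value
    return value + learning_rate * delta, delta


def simulate(rewards, learning_rate=0.1, initial_value=0.0):
    """Run the update over a sequence of rewards and collect the errors."""
    value = initial_value
    deltas = []
    for reward in rewards:
        value, delta = rpe_update(value, reward, learning_rate)
        deltas.append(delta)
    return value, deltas


if __name__ == "__main__":
    # A constant reward of 1.0 on every trial (illustrative values only).
    final_value, deltas = simulate([1.0] * 50)
    print(f"first RPE: {deltas[0]:.3f}, last RPE: {deltas[-1]:.3f}")
    print(f"learned value: {final_value:.3f}")
```

In this sketch the error is largest on the first, fully unexpected reward and shrinks as the reward becomes predicted, which is the property that makes a brief "better than expected" signal a candidate teaching signal.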


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Expression of ChR2 in VTA dopamine neurons.
A, Positions of viral injections in a coronal section of ventral midbrain (−3.28 mm from bregma). All injections were found to be within 0.06 mm rostral or caudal of this section. Each dot represents the injection site for an individual mouse (red circles for ChR2-H mice, black circles for ChR2-L mice, and blue triangles for AAV- mice). Vertical scale bar at right: 0.5 mm. B, Top, ChR2-tdTomato (red) colocalized with TH immunostaining (green) as shown in the overlay at right (yellow). Bottom, ChR2-tdTomato expression was not observed in another mouse. The center of these images corresponds to the ‘X’ marks in ‘A.’ Inset scale bars: 0.15 mm. C, Number of TH+ and ChR2+ neurons in each mouse. Based upon these results, mice were categorized as “ChR2-H” (red) or “ChR2-L” (black).
Figure 2. Operant responding for 200 ms of optical stimulation.
A, Responses on the active (thick line) and inactive port (thin line) for each of the 4 ChR2-H mice, as well as 2 additional mice for which no histology results were obtained (magenta and cyan), over 9 days of acquisition followed by 9 days of extinction. On day 7, the mouse represented in magenta made 236 nose pokes at the active port (not shown). The thick gray lines, in ‘A’ and ‘B,’ represent the mean number of responses across the 4 ChR2-H mice. B, Break points on a PRS. Break points for the inactive port were zero in all cases and are not shown. C, Average responses (mean ± s.e.m.) on the active (thick line) and inactive (thin line) ports for ChR2-L (n = 4) and AAV- (n = 7) mice.
Figure 3. Rasters of response times during the operant task (90 minute sessions over 18 days in each of the 6 mice that displayed high levels of operant responding).
This is the same data summarized in Fig. 2A,B. Responses that were followed by optical stimulation are in black, and those not followed by optical stimulation are in red. Blue horizontal lines divide the acquisition and extinction periods. The two mice shown at the bottom did not undergo the extinction phase, and no histology was performed in these mice.
Figure 4. Optical stimulation promotes locomotion.
A, Following 20 minutes of habituation, there were alternating periods of 2 minutes with and without stimulation (200 ms pulses at 1 Hz), for a total of 5 periods and 10 minutes of each condition. B, Both head speed and C, number of contralateral (but not ipsilateral) rotations were greater during stimulation (white) than non-stimulation (gray) in ChR2-H mice, but not in ChR2-L or AAV- mice.
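The stimulation schedule described in panel A can be written out as a short sketch that lists pulse onset times. The numbers (20 minutes of habituation, 2-minute periods, 5 periods per condition, 1 Hz pulses) come from the caption; the function name, its parameters, and in particular the assumption that the first period after habituation is a stimulation period are hypothetical, since the caption does not state which condition comes first.

```python
def pulse_onsets(habituation_s=20 * 60, period_s=120, n_periods=5,
                 rate_hz=1.0, stim_first=True):
    """Return pulse onset times in seconds from session start.

    stim_first is an assumption: the caption does not say whether the
    first 2-minute period is a stimulation or a non-stimulation period.
    """
    onsets = []
    pulses_per_period = int(period_s * rate_hz)
    for block in range(2 * n_periods):
        # Blocks alternate between the two conditions.
        is_stim = (block % 2 == 0) == stim_first
        if not is_stim:
            continue
        block_start = habituation_s + block * period_s
        onsets.extend(block_start + k / rate_hz
                      for k in range(pulses_per_period))
    return onsets


if __name__ == "__main__":
    onsets = pulse_onsets()
    pulse_width_s = 0.2  # 200 ms pulses, from the caption
    print(f"pulses delivered: {len(onsets)}")
    print(f"total light-on time: {len(onsets) * pulse_width_s:.0f} s")
```

Under these assumptions each stimulation period contains 120 pulses, so the 10 minutes of stimulation correspond to 600 pulses in total.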
Figure 5. Behavioral responses correlate with the number of ChR2 positive neurons.
Each point represents a single mouse. For each behavior, we measured the correlation for all ChR2 mice (long lines, p<0.01 for each behavior), as well as only the ChR2-H or ChR2-L mice (short lines). A, The y-axis indicates the difference in the number of operant responses at the active versus inactive ports (averaged across days 6–9; see Fig. 2A). B, Head speed. C, Rotations. In B and C, the y-axis indicates differences between stimulation and non-stimulation periods (see Fig. 4B,C).

