J Neurosci. 2014 May 14;34(20):6887-95.
doi: 10.1523/JNEUROSCI.5445-13.2014.

Microstimulation of the human substantia nigra alters reinforcement learning

Ashwin G Ramayya et al. J Neurosci.

Abstract

Animal studies have shown that substantia nigra (SN) dopaminergic (DA) neurons strengthen action-reward associations during reinforcement learning, but their role in human learning is not known. Here, we applied microstimulation in the SN of 11 patients undergoing deep brain stimulation surgery for the treatment of Parkinson's disease as they performed a two-alternative probability learning task in which rewards were contingent on stimuli, rather than actions. Subjects demonstrated decreased learning from reward trials that were accompanied by phasic SN microstimulation compared with reward trials without stimulation. Subjects who showed large decreases in learning also showed an increased bias toward repeating actions after stimulation trials; therefore, stimulation may have decreased learning by strengthening action-reward associations rather than stimulus-reward associations. Our findings build on previous studies implicating SN DA neurons in preferentially strengthening action-reward associations during reinforcement learning.

Keywords: Parkinson's disease; dopamine; human; microstimulation; reinforcement learning; substantia nigra.

Figures

Figure 1.
A, Intraoperative targeting of SN. During DBS surgery, a microelectrode is advanced into the SN to map the ventral border of the STN. An example preoperative MRI scan (sagittal view) overlaid with a standard brain atlas and estimated microelectrode position is shown (Jaggi et al., 2004; Zaghloul et al., 2009). B, Reinforcement learning task. During surgery, 11 subjects performed a two-alternative probability learning task with inconsistent stimulus-response mapping. C, Experimental design. During each stage of the session (50 trials each), subjects sampled reward probabilities of two item pairs that were matched in relative reward rate. Each pair of colored rectangles depicts an item pair (the green and red shading within each rectangle indicates the probability of positive and negative feedback, respectively, associated with a particular item in the pair). During Stage 1, we obtained microelectrode recordings from the SN. An example 500 ms high-pass-filtered (>300 Hz) voltage trace is shown. During Stages 2 and 3, we applied electrical microstimulation through the recording microelectrode as depicted, but no longer obtained recordings (see Materials and Methods).
Figure 2.
Effects of stimulation on learning. To index learning performance on a particular item pair, we computed the probability that subjects chose the item that was associated with a high reward probability (“accuracy”). During Stage 2, subjects demonstrated lower accuracy on the STIM+ pair compared with the SHAM pair. During Stage 3, we did not identify changes in accuracy between the STIM and SHAM pairs. *p < 0.05. Error bars indicate SEM across subjects (n = 11).
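The accuracy index and the across-subject error bars described in this caption can be illustrated with a minimal Python sketch. The function names, the item labels, and the numbers in the usage note are hypothetical and are not taken from the study; the sketch assumes the conventional SEM (sample standard deviation divided by the square root of the number of subjects).

```python
import math


def accuracy(choices, best_item):
    """Fraction of trials on which the high-reward-probability item was chosen."""
    return sum(1 for c in choices if c == best_item) / len(choices)


def mean_and_sem(values):
    """Mean and standard error of the mean across subjects.

    Assumes at least two values and uses the sample standard deviation
    (n - 1 in the denominator).
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)
```

For example, `accuracy(["A", "A", "B", "A"], "A")` gives 0.75 for one made-up subject, and `mean_and_sem` would then be applied to the list of per-subject accuracies for each item pair.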
Figure 3.
A, Relation between decreases in learning and action bias. Stimulation-related decreases in accuracy were positively correlated with an increased bias toward repeating a button press after reward trials (win-same button; Pearson's r = 0.77, p = 0.006). Each dot represents a subject, the solid black line is the regression slope, and the dashed lines represent 95% confidence intervals. B, C, The Q learning model is insufficient to explain stimulation-related behavioral changes. Simulated behavior of a standard two-parameter reinforcement learning algorithm (Q model) on a two-alternative probability learning task with inconsistent stimulus-response mapping. Accuracy (light gray line), probability of repeating rewarded items (win-stay, dark gray line), and probability of repeating rewarded actions (win-same button, black line) are shown for decreasing learning rates (α; B) and increasing noise in the choice policy (β; C). Decreases in learning rate and increases in decision noise were accompanied by a decrease in accuracy and a decrease in win-stay, but no change in win-same button.
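The simulation in panels B and C can be approximated with a short Python sketch. This is not the authors' code: it assumes a delta-rule update with learning rate `alpha`, a softmax choice rule in which `beta` acts as a noise (temperature) parameter, rewards coded as 1 or 0, and hypothetical reward probabilities of 0.8 and 0.2. The side on which each item appears is redrawn on every trial to mimic the inconsistent stimulus-response mapping.

```python
import math
import random


def simulate_q(alpha, beta, n_trials=5000, p_high=0.8, p_low=0.2, seed=0):
    """Simulate a two-parameter Q model; beta > 0 is treated as choice noise."""
    rng = random.Random(seed)
    q = [0.0, 0.0]          # item values; item 0 is the high-reward item
    n_correct = 0
    wins = stay = same = 0  # counters for trials that follow a reward
    prev = None             # (item, button, rewarded) from the previous trial
    for _ in range(n_trials):
        left_item = rng.randrange(2)  # item shown on the left button (0)
        # Softmax over the two items: probability of choosing item 0.
        p_item0 = 1.0 / (1.0 + math.exp(-(q[0] - q[1]) / beta))
        item = 0 if rng.random() < p_item0 else 1
        button = 0 if item == left_item else 1
        rewarded = rng.random() < (p_high if item == 0 else p_low)
        if prev is not None and prev[2]:
            wins += 1
            stay += int(item == prev[0])    # repeated the rewarded item
            same += int(button == prev[1])  # repeated the rewarded button
        # Delta-rule update of the chosen item's value.
        q[item] += alpha * ((1.0 if rewarded else 0.0) - q[item])
        n_correct += int(item == 0)
        prev = (item, button, rewarded)
    wins = max(wins, 1)
    return {
        "accuracy": n_correct / n_trials,
        "win_stay": stay / wins,
        "win_same_button": same / wins,
    }
```

Because this model stores values only for items, and item positions are redrawn independently on each trial, the probability of repeating a rewarded button should stay near 0.5 whatever values `alpha` and `beta` take, which matches the pattern the caption reports.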
Figure 4.
A, Hybrid AQ learning model. Shown is the simulated behavior of the three-parameter reinforcement learning algorithm (hybrid AQ model) on a two-alternative probability learning task with inconsistent stimulus-response mapping. Accuracy (light gray line), probability of repeating rewarded items (win-stay, dark gray line), and probability of repeating rewarded actions (win-same button, black line) are shown for varying values of the action value weighting parameter (WA). Strengthened action–reward associations were associated with decreases in accuracy and win-stay and increases in win-same button. B, Stimulation-related behavioral changes can be explained by strengthened action–reward associations. We quantitatively fit the hybrid AQ model to subjects' behavior on the STIM+ and SHAM pair during Stage 2. We found that stimulation-related decreases in accuracy showed a significant positive relation with increases in WA, but not α or β. See main text for statistics.
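One way such a hybrid model could be written is sketched below in Python. The caption does not give the model equations, so every detail here is an assumption: separate delta-rule values for items (`q`) and buttons (`a`), a combined option value of `(1 - w_a) * q[item] + w_a * a[button]`, a softmax with `beta` as a noise parameter, rewards coded as 1 or 0, and hypothetical reward probabilities of 0.8 and 0.2. The name `w_a` stands in for the action value weighting parameter WA.

```python
import math
import random


def simulate_hybrid_aq(alpha, beta, w_a, n_trials=5000,
                       p_high=0.8, p_low=0.2, seed=0):
    """Simulate an assumed hybrid of item values (Q) and action values (A)."""
    rng = random.Random(seed)
    q = [0.0, 0.0]          # item values; item 0 is the high-reward item
    a = [0.0, 0.0]          # action values; button 0 = left, 1 = right
    n_correct = 0
    wins = stay = same = 0  # counters for trials that follow a reward
    prev = None             # (item, button, rewarded) from the previous trial
    for _ in range(n_trials):
        left_item = rng.randrange(2)  # inconsistent stimulus-response mapping
        right_item = 1 - left_item
        # Weighted mix of item value and action value for each option.
        v_left = (1.0 - w_a) * q[left_item] + w_a * a[0]
        v_right = (1.0 - w_a) * q[right_item] + w_a * a[1]
        p_left = 1.0 / (1.0 + math.exp(-(v_left - v_right) / beta))
        button = 0 if rng.random() < p_left else 1
        item = left_item if button == 0 else right_item
        rewarded = rng.random() < (p_high if item == 0 else p_low)
        if prev is not None and prev[2]:
            wins += 1
            stay += int(item == prev[0])
            same += int(button == prev[1])
        r = 1.0 if rewarded else 0.0
        q[item] += alpha * (r - q[item])      # update chosen item value
        a[button] += alpha * (r - a[button])  # update chosen action value
        n_correct += int(item == 0)
        prev = (item, button, rewarded)
    wins = max(wins, 1)
    return {
        "accuracy": n_correct / n_trials,
        "win_stay": stay / wins,
        "win_same_button": same / wins,
    }
```

Under these assumptions, comparing `w_a=0.0` with `w_a=1.0` at fixed `alpha` and `beta` should reproduce the qualitative pattern in panel A: accuracy falls toward chance while the tendency to repeat a rewarded button rises above 0.5.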
Figure 5.
Win-same button during congruent and incongruent trials. A, Subjects who showed stimulation-related increases in win-same button (n = 5) showed asymmetric changes during congruent (gray) and incongruent (black) trials when comparing STIM+ and SHAM trials. B, C, Simulated behavior of a Q learning model shows symmetric changes in win-same button during congruent and incongruent trials. D, Strengthened action–reward associations in the hybrid AQ learning model result in asymmetric changes in win-same button.
Figure 6.
Relation between stimulation-related action bias and recorded neural activity. A, B, Stimulation-related increases in win-same button were positively correlated with postreward phasic responses (A) and the mean waveform duration (B) of multiunit activity recorded during Stage 1. Each dot represents a subject, the solid black line is the regression slope, and the dashed lines represent 95% confidence intervals. Nine of the 11 subjects contributed to this analysis (we were unable to obtain recordings from Subject #3 and we did not identify spiking activity from Subject #11; see Materials and Methods). C, Example waveforms and postreward phasic responses of unit activity from the two subjects who showed the greatest increases in win-same button (outlined in black in A and B). For each unit, we show the average waveform (top left, gray shading marks the SD), the interspike interval (bottom left, black line marks 3 ms), the average postreward firing response (top right, smoothed with a Gaussian kernel of half-width = 75 ms; gray shading indicates SE of mean), and the spike raster after reward trials. Dashed black line indicates reward onset.
