Randomized Controlled Trial
Eur J Neurosci. 2012 Apr;35(7):1011-23.
doi: 10.1111/j.1460-9568.2011.07920.x.

Dissociating hippocampal and striatal contributions to sequential prediction learning

Aaron M Bornstein et al. Eur J Neurosci. 2012 Apr.

Abstract

Behavior may be generated on the basis of many different kinds of learned contingencies. For instance, responses could be guided by the direct association between a stimulus and response, or by sequential stimulus-stimulus relationships (as in model-based reinforcement learning or goal-directed actions). However, the neural architecture underlying sequential predictive learning is not well understood, in part because it is difficult to isolate its effect on choice behavior. To track such learning more directly, we examined reaction times (RTs) in a probabilistic sequential picture identification task in healthy individuals. We used computational learning models to isolate trial-by-trial effects of two distinct learning processes in behavior, and used these as signatures to analyze the separate neural substrates of each process. RTs were best explained via the combination of two delta rule learning processes with different learning rates. To examine neural manifestations of these learning processes, we used functional magnetic resonance imaging to seek correlates of time-series related to expectancy or surprise. We observed such correlates in two regions, hippocampus and striatum. By estimating the learning rates best explaining each signal, we verified that they were uniquely associated with one of the two distinct processes identified behaviorally. These differential correlates suggest that complementary anticipatory functions drive each region's effect on behavior. Our results provide novel insights as to the quantitative computational distinctions between medial temporal and basal ganglia learning networks and enable experiments that exploit trial-by-trial measurement of the unique contributions of both hippocampus and striatum to response behavior.
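The abstract names delta rule learning with two different learning rates but does not give the update equations. As a rough illustration only, one common form of a delta rule over first-order stimulus transitions can be sketched as below. The update rule, the function name `delta_rule_transitions`, the learning-rate values (0.5 and 0.05), and the toy sequence are all assumptions made for this sketch, not the authors' fitted model.

```python
import numpy as np

def delta_rule_transitions(sequence, n_stimuli, learning_rate):
    """Track P(next stimulus | current stimulus) with a delta rule.

    Returns the probability the learner assigned to each observed
    stimulus just before seeing it, plus the final transition matrix.
    Hypothetical helper for illustration only.
    """
    # Start from a uniform estimate for every row of the transition matrix.
    P = np.full((n_stimuli, n_stimuli), 1.0 / n_stimuli)
    predicted = []
    for prev, curr in zip(sequence[:-1], sequence[1:]):
        # Expectancy for the stimulus that actually appeared.
        predicted.append(P[prev, curr])
        # Move the row for `prev` toward a one-hot target at `curr`.
        target = np.zeros(n_stimuli)
        target[curr] = 1.0
        P[prev] += learning_rate * (target - P[prev])
    return np.array(predicted), P

# Toy deterministic cycle over four stimuli (assumed data, not from the study).
sequence = [0, 1, 2, 3] * 25
fast_p, fast_P = delta_rule_transitions(sequence, 4, learning_rate=0.5)
slow_p, slow_P = delta_rule_transitions(sequence, 4, learning_rate=0.05)
```

Because each update moves a row toward a one-hot target by a fixed fraction, every row stays a valid probability distribution. The higher learning rate adapts quickly and forgets quickly; the lower one averages over a longer history.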

Figures

Figure 1
Task design. (A) Training. Participants were first trained to deterministically associate each of four buttons with one of the stimulus images. Training proceeded until participants reached a fixed accuracy criterion. The associations between stimuli and responses were not varied during the course of the task. (B) Test. Images were presented one at a time for a fixed 2000 ms, regardless of the keypress response. At the first correct keypress, a gray bounding box appeared around the image and remained displayed for 300 ms or until the end of the fixed trial time, whichever came first. Reaction time was recorded to the first keypress. (C) Transition structure. Successive images were chosen according to a first-order transition structure, which participants were not told about. This structure changed abruptly at two points during the task, unaligned to rest periods and with no visual or other notification.
Figure 2
Sequential learning. (A) Although participants were unaware of the task structure, their reaction times reflected the designed probabilities: reaction time decreased as the conditional likelihood of the image increased. (B) An analysis of the influence of prior responses on reaction time on the current trial shows a decaying effect of previous experience, with significant contributions from the seven most recent presentations of the current image. Reaction time for a given image-image transition was lowered by more recent experience with that transition; this effect showed an exponential relationship between recency of experience and reaction time. This pattern excludes models that do not incorporate forgetting of past experience. □ = p < 0.05. Error bars are SEM. (C) Model comparison. Individual log Bayes factors in favor of a model using two learning processes, versus a single process. The two-process model was decisively favored for 14 of 18 subjects and was a significantly better fit across the population (summed log Bayes factor 145, p < 5e−5 by likelihood ratio test).
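The exponential recency effect in panel B is what a delta rule would produce: unrolling the update shows that the outcome k steps back receives weight α(1 − α)^k. The sketch below checks that identity numerically. The learning rate of 0.3, the outcome series, and the function names are illustrative assumptions, not values or code from the study.

```python
import numpy as np

def delta_rule_estimate(outcomes, learning_rate, initial=0.0):
    """Run a scalar delta rule over a series of outcomes."""
    estimate = initial
    for outcome in outcomes:
        estimate += learning_rate * (outcome - estimate)
    return estimate

def recency_weights(n_outcomes, learning_rate):
    """Weight given to the outcome k steps back (k = 0 is the most recent)."""
    lags = np.arange(n_outcomes)
    return learning_rate * (1.0 - learning_rate) ** lags

# Assumed toy series: 1 = the transition occurred, 0 = it did not.
outcomes = np.array([1, 0, 0, 1, 1, 0, 1, 1], dtype=float)
alpha = 0.3  # assumed learning rate

weights = recency_weights(len(outcomes), alpha)
# Reverse the outcomes so that index 0 lines up with the most recent one.
weighted_sum = float(np.dot(weights, outcomes[::-1]))
running = delta_rule_estimate(outcomes, alpha, initial=0.0)
```

With an initial estimate of zero, `weighted_sum` and `running` agree, so the trial-by-trial update and the exponentially decaying weighting of past experience are two views of the same computation.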
Figure 3
Areas where the BOLD signal correlated with the entropy over the distribution of upcoming stimuli, generated at each of our analyzed learning rates. Images are thresholded at p < 0.001, uncorrected, for display purposes. Row A shows activation observed in the fast process GLM, with clusters of negative correlation in ventral striatum and anterior insula. Row B shows activation observed in the slow process GLM, a positively correlated cluster in anterior hippocampus. The activation visible in posterior parahippocampal cortex did not survive correction for multiple comparisons.
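The caption does not say how the entropy regressor was computed. A minimal sketch, assuming it is the Shannon entropy (in bits) of the learner's current predictive distribution over the next stimulus, might look like this; the function name and the example distributions are hypothetical.

```python
import numpy as np

def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    p = np.asarray(probabilities, dtype=float)
    # Zero-probability outcomes contribute nothing, so drop them before the log.
    nonzero = p[p > 0]
    return float(-np.sum(nonzero * np.log2(nonzero)))

# Assumed predictive distributions over four possible next stimuli.
uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # maximal uncertainty
peaked = shannon_entropy([0.7, 0.1, 0.1, 0.1])       # one likely successor
certain = shannon_entropy([1.0, 0.0, 0.0, 0.0])      # fully predictable
```

Under this assumption, entropy is highest when all four successors are equally likely and falls to zero when the next stimulus is fully predictable, which makes it a measure of anticipatory uncertainty before the stimulus appears.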
Figure 4
Areas where the BOLD signal correlated with the conditional probability of the current stimulus, generated at each of our analyzed learning rates. Images are thresholded at p < 0.001, uncorrected, for display purposes. Using the fast process GLM, significant activation was observed in ventral striatum. No clusters significantly correlated with conditional probability were observed in the slow process GLM.
Figure 5
Comparison of learning rates implied by activity in our primary regions of interest. These values were computed by first identifying voxels in our a priori regions of interest (hippocampus and ventral striatum) which were maximally responsive to our model regressors (probability and entropy) when generated at the midpoint of our behaviorally obtained learning rates (black dotted line), then estimating best-fitting learning rates by deviations from this baseline (see Supplementary Methods). Bars represent the average implied learning rate across subjects, at a single voxel for each combination of region and regressor: left hippocampus (−26, −14, 24) and right ventral striatum (entropy 18, 16, −6; probability 20, 6, −2). Error bars represent the positive and negative confidence intervals, across subjects.

