Clinical Trial
Biol Psychiatry. 2007 Oct 1;62(7):756-64.
doi: 10.1016/j.biopsych.2006.09.042. Epub 2007 Feb 14.

Selective reinforcement learning deficits in schizophrenia support predictions from computational models of striatal-cortical dysfunction


James A Waltz et al. Biol Psychiatry. 2007.

Abstract

Background: Rewards and punishments may make distinct contributions to learning via separate striatal-cortical pathways. We investigated whether fronto-striatal dysfunction in schizophrenia (SZ) is characterized by selective impairment in either reward- (Go) or punishment-driven (NoGo) learning.

Methods: We administered two versions of a probabilistic selection task to 40 schizophrenia patients and 31 control subjects, using difficult-to-verbalize stimuli (experiment 1) and nameable objects (experiment 2). In an acquisition phase, participants learned to choose between the two stimuli in each of three different pairs (AB, CD, EF), presented in random order, based on probabilistic feedback (80%, 70%, 60%). We used analyses of variance (ANOVAs) to assess the effects of group and reinforcement probability on two measures of contingency learning. To characterize the preference of subjects for choosing the most rewarded stimulus and avoiding the most punished stimulus, we subsequently tested participants with novel pairs of stimuli involving either A or B, providing no feedback.
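The acquisition phase described above can be illustrated with a minimal Python sketch. It is not the authors' task code. The 80%, 70%, and 60% values come from the abstract; the complementary probabilities for the second stimulus in each pair, the block size of 20 trials per pair, and the random-choice "participant" are assumptions made for illustration.

```python
import random

# Probability of positive feedback for each stimulus.
# 0.80 / 0.70 / 0.60 are from the abstract; the complementary values
# (0.20 / 0.30 / 0.40) for the paired stimuli are an assumption.
REWARD_PROB = {"A": 0.80, "B": 0.20, "C": 0.70, "D": 0.30, "E": 0.60, "F": 0.40}
TRAINING_PAIRS = [("A", "B"), ("C", "D"), ("E", "F")]


def feedback(choice, rng):
    """Return True (positive feedback) with the chosen stimulus's reward probability."""
    return rng.random() < REWARD_PROB[choice]


def simulate_block(n_per_pair, rng):
    """Present each training pair n_per_pair times in random order.

    The simulated participant picks at random (hypothetical placeholder for
    a real participant or a learning model).
    Returns a list of (pair, choice, rewarded) tuples.
    """
    trials = [pair for pair in TRAINING_PAIRS for _ in range(n_per_pair)]
    rng.shuffle(trials)
    log = []
    for pair in trials:
        choice = rng.choice(pair)
        log.append((pair, choice, feedback(choice, rng)))
    return log


rng = random.Random(0)          # fixed seed so the sketch is reproducible
block = simulate_block(20, rng)  # 20 trials per pair is an assumed block size
```

Replacing the random choice with a learning rule would turn this into a simple model of contingency learning; the sketch only shows how probabilistic feedback is delivered.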

Results: Control subjects demonstrated superior performance during the first 40 acquisition trials in each of the 80% and 70% conditions versus the 60% condition; patients showed similarly impaired (<60%) performance in all three conditions. In novel test pairs, patients showed decreased preference for the most rewarded stimulus (A; t = 2.674; p = .01). Patients were unimpaired at avoiding the most negative stimulus (B; t = .737).

Conclusions: The results of these experiments provide additional evidence for the presence of deficits in reinforcement learning in SZ, suggesting that reward-driven learning may be more profoundly impaired than punishment-driven learning.


Figures

Fig. 1
The Probabilistic Stimulus Selection (PSS) Task. The task consists of two phases. During an “acquisition phase,” subjects are presented with three training pairs and instructed to identify which stimulus from each pair is more frequently reinforced. In AB trials, for example, a choice of stimulus A leads to positive feedback in 80% of trials, whereas a B choice is reinforced in only 20%. Learning the most frequently rewarded stimulus in each pair can be accomplished either by learning that one of the stimuli leads to positive feedback, or that the other leads to negative feedback (or both). Subjects are told to choose that stimulus as often as possible. Once subjects reach criterion on all three training pairs, or complete 360 total trials, they proceed to a “post-acquisition test phase,” during which they are presented with four trials each of the three training pairs, along with 12 new pairs created from all unused combinations of the training stimuli. The eight new stimulus pairs involving A or B are called the “transfer pairs” and are used to gauge “Go” and “NoGo” learning. If subjects learned more from positive feedback, they should reliably choose stimulus A in all novel test pairs in which it is present; if they learned more from negative feedback, they should avoid stimulus B.
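The pair counts in this caption can be checked with a short Python sketch. The variable names are illustrative and not taken from the paper.

```python
from itertools import combinations

STIMULI = ["A", "B", "C", "D", "E", "F"]
TRAINING_PAIRS = {("A", "B"), ("C", "D"), ("E", "F")}

# Every unordered pair of the six training stimuli.
all_pairs = list(combinations(STIMULI, 2))

# Novel test pairs: all combinations not used during acquisition.
novel_pairs = [p for p in all_pairs if p not in TRAINING_PAIRS]

# Transfer pairs: novel pairs that contain A ("Go") or B ("NoGo").
choose_a_pairs = [p for p in novel_pairs if "A" in p]
avoid_b_pairs = [p for p in novel_pairs if "B" in p]

print(len(all_pairs))       # 15
print(len(novel_pairs))     # 12
print(len(choose_a_pairs))  # 4
print(len(avoid_b_pairs))   # 4
```

Six stimuli give 15 unordered pairs; removing the three training pairs leaves the 12 novel pairs, of which four contain A and four contain B, for eight transfer pairs in total.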
Fig. 2
The cortico-striato-thalamo-cortical loops, including the direct and indirect pathways of the basal ganglia (BG). The cells of the striatum are divided into two sub-classes based on differences in biochemistry and efferent projections. The “Go” cells project directly to the GPi/SNr, and their activity disinhibits the thalamus, thereby facilitating the execution of a cortical response. The “NoGo” cells are part of the indirect pathway to the GPi/SNr and have an opposing effect, suppressing the execution of actions. Dopamine from the SNc projects to the dorsal striatum, differentially modulating activity in the direct and indirect pathways by activating different receptors: the Go cells express the D1 receptor, and the NoGo cells express the D2 receptor. The orbitofrontal cortex (OFC) is thought to maintain reinforcement-related information in working memory and to provide top-down biasing of the more primitive BG system, in addition to directly influencing response selection processes in premotor cortex. The OFC receives information about the relative magnitude of reinforcement values from the ABL, which it can also maintain in working memory. Dopamine from the VTA projects to ventral striatum (not shown) and orbitofrontal cortex. GPi: internal segment of globus pallidus; GPe: external segment of globus pallidus; SNc: substantia nigra pars compacta; SNr: substantia nigra pars reticulata; VTA: ventral tegmental area; ABL: basolateral amygdala.
Fig. 3
Acquisition of probabilistic contingencies by patients (SZs) and controls (NCs) in Experiment 2. (A) Performance in acquisition blocks 1 and 2. (B) Performance on training pairs at post-acquisition test. The proportion of correct responses was defined as the proportion of trials on which the most frequently reinforced stimulus was chosen. In both panels, black bars = control subjects, white bars = patients.
Fig. 4
Performance of subjects on two measures of feedback-driven learning from Experiment 2. In both plots, black bars = control subjects, white bars = patients. (A) Impact of trial-by-trial task feedback on subsequent choices in a given condition in the first acquisition block (20 trials in each stimulus condition). “Win-stay” scores reflect the proportion of repeated stimulus selections in a given condition following reinforced choices. “Lose-shift” scores reflect the proportion of switched stimulus selections in a given condition following non-reinforced choices. Total “win-stay” and “lose-shift” scores were generated by averaging scores across conditions for each. (B) Performance of the 24 controls and 32 patients who qualified for transfer analysis in the post-acquisition test phase. This analysis only included subjects who demonstrated acquisition of the 80:20 contingency by choosing A on at least 75% of AB test trials, and thus, the groups showed similar performance on the AB (80:20) test pair. “Go” learning was assessed using novel pairs involving the 80%-reinforced stimulus (Choose A v. Novel), as choosing A depends on having learned from positive feedback. “NoGo” learning was assessed using novel pairs involving the 20%-reinforced stimulus (Avoid B v. Novel), as avoiding B depends on having learned from negative feedback.
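The measures defined in this caption could be computed along the following lines. This is a hedged Python sketch, not the authors' analysis code: the function names, the trial-log formats, and the handling of empty cases (returning NaN) are all assumptions.

```python
def win_stay_lose_shift(trials):
    """Compute (win_stay, lose_shift) for ONE stimulus condition.

    trials: ordered list of (choice, rewarded) tuples for a single pair.
    win_stay   = proportion of reinforced choices followed by the same choice.
    lose_shift = proportion of non-reinforced choices followed by a switch.
    """
    wins = stays = losses = shifts = 0
    for (prev_choice, prev_rewarded), (next_choice, _) in zip(trials, trials[1:]):
        if prev_rewarded:
            wins += 1
            stays += next_choice == prev_choice
        else:
            losses += 1
            shifts += next_choice != prev_choice
    win_stay = stays / wins if wins else float("nan")
    lose_shift = shifts / losses if losses else float("nan")
    return win_stay, lose_shift


def transfer_scores(test_trials, criterion=0.75):
    """Compute (qualifies, choose_a, avoid_b) from no-feedback test trials.

    test_trials: list of (pair, choice) tuples (assumed log format).
    qualifies: True if A was chosen on at least `criterion` of AB trials.
    """
    ab = [c for p, c in test_trials if set(p) == {"A", "B"}]
    a_novel = [c for p, c in test_trials if "A" in p and "B" not in p]
    b_novel = [c for p, c in test_trials if "B" in p and "A" not in p]
    qualifies = bool(ab) and sum(c == "A" for c in ab) / len(ab) >= criterion
    choose_a = sum(c == "A" for c in a_novel) / len(a_novel) if a_novel else float("nan")
    avoid_b = sum(c != "B" for c in b_novel) / len(b_novel) if b_novel else float("nan")
    return qualifies, choose_a, avoid_b


# Made-up example data, for illustration only.
ab_trials = [("A", True), ("A", True), ("B", False),
             ("A", True), ("A", False), ("A", True)]
ws, ls = win_stay_lose_shift(ab_trials)  # ws = 2/3, ls = 0.5

test_log = [(("A", "B"), "A"), (("A", "B"), "A"), (("A", "B"), "A"), (("A", "B"), "B"),
            (("A", "C"), "A"), (("A", "D"), "D"),
            (("B", "C"), "C"), (("B", "E"), "E")]
ok, go, nogo = transfer_scores(test_log)  # ok = True, go = 0.5, nogo = 1.0
```

Total win-stay and lose-shift scores, as in panel A, would then be the averages of the per-condition scores across the AB, CD, and EF conditions.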
