Prediction-error-dependent processing of immediate and delayed positive feedback

Constanze Weber¹, Christian Bellebaum²

Affiliations

¹ Faculty of Mathematics and Natural Sciences, Institute of Experimental Psychology, Department of Biological Psychology, Heinrich Heine University Düsseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany. Constanze.Weber@hhu.de.
² Faculty of Mathematics and Natural Sciences, Institute of Experimental Psychology, Department of Biological Psychology, Heinrich Heine University Düsseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany.

PMID: 38678065
PMCID: PMC11055855
DOI: 10.1038/s41598-024-60328-8

Sci Rep. 2024 Apr 27;14(1):9674. doi: 10.1038/s41598-024-60328-8.

Abstract

Learning often involves trial-and-error, i.e. repeating behaviours that lead to desired outcomes, and adjusting behaviour when outcomes do not meet our expectations and thus lead to prediction errors (PEs). PEs have been shown to be reflected in the reward positivity (RewP), an event-related potential (ERP) component between 200 and 350 ms after performance feedback which is linked to striatal processing and assessed via electroencephalography (EEG). Here we show that this is also true for delayed feedback processing, for which a critical role of the hippocampus has been suggested. We found a general reduction of the RewP for delayed feedback, but the PE was similarly reflected in the RewP and the later P300 for immediate and delayed positive feedback, while no effect was found for negative feedback. Our results suggest that, despite processing differences between immediate and delayed feedback, positive PEs drive feedback processing and learning irrespective of delay.

Keywords: FRN; Feedback delay; Prediction error; Reinforcement learning; RewP.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Probabilistic learning task and behavioural parameters. (a) Visual stimuli. One of the two sets of symbols used as visual stimuli with their corresponding reward probabilities in the probabilistic learning task. To enable and compare learning in both feedback timing conditions, each participant underwent the task with a different set of visual stimuli in the immediate and delayed feedback version (counterbalanced between participants). (b) Schematic trial. The time course of events in a learning trial is shown. Participants’ choice of one of the two presented stimuli was indicated by a red circle for 500 ms after their response, followed by a fixation cross for 500 ms in the immediate feedback condition and for 6500 ms in the delayed feedback timing condition. The feedback was then displayed for 500 ms. Intertrial intervals varied between 1200 and 1600 ms. Participants who did not respond within 3000 ms were asked to respond more quickly. (c) Learning performance. Boxplots show averaged choice accuracies across all reward probabilities and participants (N = 20) separately for each block in each feedback timing condition. (d) Estimated learning rates. Boxplots show learning rates averaged across participants (N = 20) which were estimated separately for positive and negative feedback trials and each feedback timing condition. (e) Stimulus value estimates over the course of the experiment, separately for the five stimuli involved and the immediate and delayed feedback timing condition, averaged across participants (N = 20). Dashed lines show the objective reward probabilities for comparison.
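The learning rates in panel (d) and the stimulus value estimates in panel (e) can be illustrated with a simple delta-rule update. The following is a minimal sketch only: this record does not specify the authors' computational model, so the update rule, the function names (`update_value`, `simulate`), the starting value of 0.5, and the example learning rates are all assumptions made for illustration.

```python
def update_value(value, reward, alpha_pos, alpha_neg):
    """Apply one delta-rule update and return (new_value, prediction_error).

    Assumption: reward is 1 for positive and 0 for negative feedback, and
    the learning rate is chosen by feedback valence.
    """
    pe = reward - value  # prediction error: obtained minus expected outcome
    alpha = alpha_pos if reward == 1 else alpha_neg
    return value + alpha * pe, pe


def simulate(rewards, alpha_pos=0.3, alpha_neg=0.1, v0=0.5):
    """Track value estimates and prediction errors for one stimulus.

    The learning rates and v0 are hypothetical example values, not
    estimates reported in the study.
    """
    values, pes = [], []
    value = v0
    for reward in rewards:
        values.append(value)  # estimate held before feedback on this trial
        value, pe = update_value(value, reward, alpha_pos, alpha_neg)
        pes.append(pe)
    return values, pes, value


if __name__ == "__main__":
    values, pes, final_value = simulate([1, 1, 0, 1])
    for trial, (v, pe) in enumerate(zip(values, pes), start=1):
        print(f"trial {trial}: value={v:.3f}, PE={pe:+.3f}")
    print(f"final value: {final_value:.3f}")
```

In a sketch like this, the trial-wise prediction errors are the kind of quantity that could be entered as a predictor of single-trial ERP amplitudes, with positive prediction errors shrinking as the value estimate approaches the stimulus's reward probability.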

**Figure 2**
FRN/RewP quantification and results. (a) Feedback-locked grand-averaged ERPs at the frontocentral electrode cluster separately for the immediate and delayed feedback timing condition. Dashed lines show the time window in which the FRN/RewP was quantified. The peak latency of the difference wave (negative minus positive feedback) on which the time window for amplitude extraction was based is shown with the solid line. (b) Model-estimated marginal effects illustrating the interaction between the fixed effects of Feedback Valence and Feedback Timing, (c) the interaction between the fixed effects of PE and Feedback Timing, and (d) the interaction between the fixed effects of PE and Feedback Valence (regardless of Feedback Timing). In (e), the exploratory follow-up analyses of the interaction between PE and Feedback Valence separately for the immediate and delayed Feedback Timing condition on FRN/RewP amplitudes are illustrated. Error bars and shaded areas represent 95% confidence intervals.

**Figure 3**
P300 results. (a) Feedback-locked grand-averaged ERPs separately for immediate and delayed positive and negative feedback at the frontocentral and parietal electrode cluster. Dashed lines indicate the search time window for quantification. The peak P300 latency, which was the basis for determining the time window for single-trial-amplitude extraction, is shown with the solid line. Note that the peak latency was determined based on the signal average across negative and positive feedback and pooled across the electrodes of both electrode clusters (see Figure S1c in the Supplementary Materials). (b) Model-estimated marginal effects illustrating the interaction between the fixed effects of Feedback Valence, Feedback Timing, and Frontality, and (c) between Feedback Valence, PE, and Frontality. Error bars and shaded areas represent 95% confidence intervals.
