Cortex. 2021 Jul:140:26-39.
doi: 10.1016/j.cortex.2021.03.012. Epub 2021 Mar 27.

The aversion positivity: Mediofrontal cortical potentials reflect parametric aversive prediction errors and drive behavioral modification following negative reinforcement

Eric Rawls et al. Cortex. 2021 Jul.

Abstract

Reinforcement learning capitalizes on prediction errors (PEs), representing the deviation of received outcomes from expected outcomes. Mediofrontal event-related potentials (ERPs), in particular the feedback-related negativity (FRN)/reward positivity (RewP), are related to PE signaling, but there is disagreement as to whether the FRN/RewP encode signed or unsigned PEs. PE encoding can potentially be dissected by time-frequency analysis, as frontal theta [4-8 Hz] might represent poor outcomes, while central delta [1-3 Hz] might instead represent rewarding outcomes. However, cortical PE signaling in negative reinforcement is still poorly understood, and the role of cortical PE representations in behavioral reinforcement learning following negative reinforcement is relatively unexplored. We recorded EEG while participants completed a task with matched positive and negative reinforcement outcome modalities, with parametrically manipulated single-trial outcomes producing positive and negative PEs. We first demonstrated that PEs systematically influence future behavior in both positive and negative reinforcement conditions. In negative reinforcement conditions, mediofrontal ERPs positively signaled unsigned PEs in a time window encompassing the P2 potential, and negatively signaled signed PEs for a time window encompassing the FRN/RewP and frontal P3 (an "aversion positivity"). Central delta power increased parametrically with increasingly aversive outcomes, contributing to the "aversion positivity". Finally, negative reinforcement ERPs correlated with RTs on the following trial, suggesting cortical PEs guide behavioral adaptations. Positive reinforcement PEs did not influence ERP or time-frequency activity, despite significant behavioral effects. These results demonstrate that mediofrontal PE signals are a mechanism underlying negative reinforcement learning, and that delta power increases for aversive outcomes might contribute to the "aversion positivity."

Keywords: Feedback-related negativity (FRN); Negative reinforcement; Prediction error; Reinforcement learning; Reward positivity (RewP); Salience; Signed; Unsigned; Value.

Conflict of interest statement

Declaration of competing interest: The authors declare no competing financial interests.

Figures

Figure 1.
Task diagram and analysis for the modified reinforcement flanker paradigm. Participants were cued as to whether the current trial was to be positive or negative reinforcement with a white square or a white circle, respectively. Participants then had to respond to a flanker arrow stimulus that was either congruent or incongruent (congruent and incongruent trials were considered equivalent for the current manuscript). Only correct trials were analyzed (trials n and n+1). Finally, participants were given some amount of points (on average +50 for positive reinforcement and +0 for negative reinforcement). Every trial, the amount of points given deviated slightly from the overall mean expectation, generating outcomes that were worse-than-expected or better-than-expected. All task events outlined in red are used as control variables of no interest in analysis of RTs, ERP, and time-frequency regressions. Regressions were run separately for positive and negative reinforcement conditions. Behavior and brain analysis proceeded with three regressors: 1) signed PE, 2) unsigned PE (absolute value of signed PE), 3) weighted PE (interaction of signed and unsigned PE). Regressors of interest are outlined in green. The dependent variable for behavioral analysis was the RT immediately following reinforcement outcomes, and the dependent variable for EEG analysis was the single-trial ERP amplitude (or delta/theta-band power) time-locked to reinforcement outcomes (PEs). Dependent variables are outlined in blue.
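
The three regressors named in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the expectation is the condition mean (+50 or +0) and that the "weighted PE" is the plain product of signed and unsigned PE. The function name `build_pe_regressors` and the example point values are hypothetical, and any centering or scaling the authors may have applied is not shown.

```python
from statistics import mean


def build_pe_regressors(points, expected=None):
    """Return (signed, unsigned, weighted) prediction-error lists.

    points   : points awarded on each trial of one reinforcement condition
    expected : assumed expectation; defaults to the mean of `points`
               (an assumption made for this sketch)
    """
    if expected is None:
        expected = mean(points)
    # Signed PE: received outcome minus expected outcome.
    signed = [p - expected for p in points]
    # Unsigned PE: absolute value of the signed PE.
    unsigned = [abs(s) for s in signed]
    # Weighted PE: interaction (product) of signed and unsigned PE.
    weighted = [s * u for s, u in zip(signed, unsigned)]
    return signed, unsigned, weighted


# Hypothetical positive-reinforcement outcomes around a +50 expectation.
signed, unsigned, weighted = build_pe_regressors([40, 55, 50, 60, 45], expected=50)
print(signed, unsigned, weighted)
```

Because the product term preserves the sign of the PE while growing with its magnitude, it behaves like a signed, magnitude-weighted PE, which is one plausible reading of the "weighted PE" label.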
Figure 2.
Reaction times immediately following reinforcing outcomes (trial n+1) were analyzed using single-trial regression within subjects, with trial n signed PE, unsigned PE, and the interaction (signed × unsigned PE) as regressors. This analysis also included trial n-1 reinforcement type (positive or negative), signed PE, unsigned PE, and the interaction (signed × unsigned PE), as well as accuracy, as regressors of no interest. Within-subject regression weights were tested for significance against a null-hypothesis mean of zero using one-sample t-tests. Distribution dot-plots show individual subject regression weights, with error bars corresponding to ±SEM. Distribution dot-plots were produced using the plotSpread.m function from the MATLAB file exchange (https://www.mathworks.com/matlabcentral/fileexchange/37105-plot-spread-points-beeswarm-plot). Significant moderation effects were probed at −1 SD and +1 SD, as suggested by Baron and Kenny (1986).
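
The group-level step described in the Figure 2 caption (testing within-subject regression weights against zero) can be sketched as below. This is an illustrative sketch under stated assumptions: the function name `one_sample_t` and the example weights are hypothetical, and the code returns only the t statistic, degrees of freedom, and SEM. Converting t to a p-value would additionally require the t distribution, which is left out here.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(weights, null_mean=0.0):
    """One-sample t statistic for per-subject regression weights.

    Returns (t, degrees_of_freedom, sem).
    """
    n = len(weights)
    # Standard error of the mean, using the sample (n - 1) standard deviation.
    sem = stdev(weights) / sqrt(n)
    t = (mean(weights) - null_mean) / sem
    return t, n - 1, sem


# Hypothetical regression weights, one per subject.
t, df, sem = one_sample_t([0.2, 0.4, 0.1, 0.3, 0.5])
print(f"t({df}) = {t:.3f}, SEM = {sem:.4f}")
```

The same SEM is what the error bars in the dot-plots would represent if they are computed across subjects, which is how the caption reads.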
Figure 3.
Results of single-trial ERP regression analysis. All topographic plots were masked using an alpha of .05, corrected for multiple comparisons using the false discovery rate. All line plots (regression weights and ERPs) are plotted with shading corresponding to ±SEM. For negative reinforcement conditions, a strong negative signed PE was present over mediofrontal sensors from ~200 – 550 ms. An unsigned PE was present over mediofrontal sensors at ~150 ms, and this effect shifted to be more posterior over time. Other than a brief signed PE representation over parietal sensors (~150 ms), no regression effects were significant for positive reinforcement conditions.
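
The false discovery rate masking mentioned in the caption can be illustrated with the Benjamini-Hochberg step-up procedure. This is a sketch that assumes the authors used the standard Benjamini and Hochberg (1995) procedure listed in the references; the function name `benjamini_hochberg` and the example p-values are hypothetical, and how tests were pooled across sensors and timepoints is not specified in the caption.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking tests that survive FDR correction."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Every test ranked at or below k_max is declared significant.
    mask = [False] * m
    for idx in order[:k_max]:
        mask[idx] = True
    return mask


# Hypothetical p-values, e.g. one per sensor at a single timepoint.
mask = benjamini_hochberg([0.039, 0.001, 0.20, 0.008, 0.041], alpha=0.05)
print(mask)
```

Entries where the mask is False correspond to the sensors or timepoints that would be blanked out in a masked topographic plot.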
Figure 4.
Results of single-trial delta time-frequency regression analysis. All topographic plots were masked using an alpha of .05, corrected for multiple comparisons using the false discovery rate. All line plots (regression weights and power) are plotted with shading corresponding to ±SEM. For negative reinforcement conditions, a strong negative signed PE was present over central sensors from ~100 – 650 ms. No regression effects were significant for positive reinforcement conditions.
Figure 5.
Results of single-trial theta time-frequency regression analysis. All topographic plots were masked using an alpha of .05, corrected for multiple comparisons using the false discovery rate. All line plots (regression weights and power) are plotted with shading corresponding to ±SEM. For negative reinforcement conditions, a negative signed PE was present over frontal sensors from ~100 – 200 ms, and a negative signed PE effect was significant over occipital sensors from 100 – 400 ms. This occipital effect was qualified by an interaction of signed and unsigned PE. No regression effects were significant for positive reinforcement conditions.
Figure 6.
Single-trial partial Spearman correlations between ERP amplitude and following-trial reaction times. For negative reinforcement, results indicated that central ERP activity during the FRN/RewP time period (maximal at ~250 ms) significantly positively predicted response times on the following (correct) trial. No correlation results passed significance for positive reinforcement conditions. Nonsignificant results are masked in the topographic plot (top), and red shading in the bottom plot indicates ±SEM around the Spearman correlation coefficients. The blue shaded rectangle in the timecourse plot indicates timepoints that were significant following correction for multiple comparisons.
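
A partial Spearman correlation of the kind named in the Figure 6 caption can be sketched by rank-transforming each variable and then applying the first-order partial correlation formula. This is an illustration only: the caption does not state which covariates were partialled out, so the single covariate `z`, the function names, and the example data are all assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y, controlling for one covariate z."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical single-trial data: ERP amplitude, next-trial RT, one covariate.
erp = [1, 2, 3, 4, 5]
rt = [2, 1, 4, 3, 5]
covariate = [5, 3, 1, 4, 2]
print(round(partial_spearman(erp, rt, covariate), 4))
```

With several covariates, the equivalent approach is to regress the covariate ranks out of both rank vectors and correlate the residuals.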

References

    1. Baer RA, Smith GT, Hopkins J, Krietemeyer J, & Toney L (2006). Using self-report assessment methods to explore facets of mindfulness. Assessment, 13(1), 27–45. doi: 10.1177/1073191105283504
    2. Baron RM, & Kenny DA (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173. doi: 10.1037/0022-3514.51.6.1173
    3. Beatty P, Buzzell G, Roberts D, & Mcdonald C (2020). Contrasting time and frequency domains: ERN and induced theta oscillations differentially predict post-error behavior. Cognitive, Affective, & Behavioral Neuroscience, 20. doi: 10.3758/s13415-020-00792-7
    4. Bellebaum C, & Daum I (2008). Learning-related changes in reward expectancy are reflected in the feedback-related negativity. The European Journal of Neuroscience, 27(7), 1823–1835. doi: 10.1111/j.1460-9568.2008.06138.x
    5. Benjamini Y, & Hochberg Y (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1), 289–300. JSTOR.
