Nat Hum Behav. 2024 Sep;8(9):1726-1737. doi: 10.1038/s41562-024-01930-8. Epub 2024 Jul 16.

Humans adaptively deploy forward and backward prediction

Paul B Sharp et al. Nat Hum Behav. 2024 Sep.

Abstract

The formation of predictions is essential to our ability to build models of the world and use them for intelligent decision-making. Here we challenge the dominant assumption that humans form only forward predictions, which specify what future events are likely to follow a given present event. We demonstrate that in some environments, it is more efficient to use backward prediction, which specifies what present events are likely to precede a given future event. This is particularly the case in diverging environments, where possible future events outnumber possible present events. Correspondingly, in six preregistered experiments (n = 1,299) involving both simple decision-making and more challenging planning tasks, we find that humans engage in backward prediction in divergent environments and use forward prediction in convergent environments. We thus establish that humans adaptively deploy forward and backward prediction in the service of efficient decision-making.
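To make the efficiency argument concrete, the following minimal Python/NumPy sketch (using an invented divergent map, not the study's actual task) counts how many transition probabilities each strategy must consult to evaluate a single rewarded future state: forward prediction scans every outgoing probability of the candidate present states, whereas backward prediction scans only the probabilities flowing into the goal.

    import numpy as np

    # Invented divergent map: 2 present states fan out onto 8 possible future states.
    T = np.zeros((2, 8))      # T[s, s_next] = P(future s_next | present s)
    T[0, :4] = 0.25
    T[1, 4:] = 0.25

    goal = 5                  # a single rewarded future state

    # Forward prediction: consult every outgoing probability of both candidates.
    forward_lookups = np.count_nonzero(T)            # 8 probabilities
    # Backward prediction: consult only the probabilities leading into the goal.
    backward_lookups = np.count_nonzero(T[:, goal])  # 1 probability

    print(forward_lookups, backward_lookups)  # 8 1

The asymmetry reverses in a convergent map, where many present states funnel into few future states, which is why forward prediction becomes the cheaper strategy there.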

Conflict of interest statement

Competing interests

The authors declare no competing interests.

Figures

Fig. 1. Humans use PR to make decisions.
(A) Task Design. (Top) Divergent state space. States are represented by images, and darker arrows denote higher state-to-state transition probability. To help differentiate SR and PR, the starting states, trident and planet, differed in base-rates. (Bottom) Learning phase: participants were forced to choose a starting state and observed transitions from the starting to an intermediate state (e.g., trident to bell) and then from the intermediate to a final state. Decision phase: participants were instructed which states would be rewarded, and then chose the state they estimated led to the most reward. Map-Change Test: participants were queried after two instructed changes to the state space (e.g., trident leads to fox) that rendered PR unhelpful. (B) Successor Representation (SR). (Left) Starting-state SRs, with example instructed rewards under states. (Right) Value is assigned to each starting state (s) by taking the dot product of the starting-state’s SR and the vector of reward values for each successor state (s′). (C) Predecessor Representation (PR). (Left) The rewarded states’ PRs for the same query. Here, PR is more efficient than SR because PRs comprise fewer probabilities. (Right) Value is assigned to each starting state by multiplying each PR by the reward vector and then averaging these products. (D) Participants’ choices relied on backward prediction. (Top) Proportions of participants’ choices of the starting state to which backward prediction assigns higher value. (Bottom) The posterior distribution of the population mode of PR-consistent choice rate (yellow bar: 95% highest density interval). 0.5 denotes the maximal null value at which choices are not consistent with PR. (E) Map-Change Test. The accuracy of participants’ choices before and after an instructed change to map. (F) Model Comparison. We modeled participants’ choices as a function of each model’s values. Here, PRG and SRG are variants of PR and SR that only consider the greatest reward offered. MB: model-based. BR: base rate bias. Null model guesses the initial state (i.e., 0.5 probability of choosing either state). For copyright reasons, the emojis used in the study cannot be displayed and are replaced with equivalents in all figures.
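As a worked illustration of panels B and C, here is a minimal Python/NumPy sketch of the two valuations. The transition matrix T, base rates p and instructed rewards r are made-up numbers, and averaging the PR-weighted rewards follows the caption's description rather than the authors' fitted model.

    import numpy as np

    # Made-up one-step environment: 2 starting states, 4 successor states.
    T = np.array([[0.7, 0.3, 0.0, 0.0],    # T[s, s_next] = P(s_next | s)
                  [0.0, 0.2, 0.5, 0.3]])
    p = np.array([0.8, 0.2])               # base rates of the starting states
    r = np.array([0.0, 100.0, 0.0, 0.0])   # instructed reward for each successor state

    # SR-style (forward) valuation, panel B: dot product of each starting state's
    # successor probabilities with the reward vector.
    v_sr = T @ r                                      # [30., 20.]

    # PR-style (backward) valuation, panel C: Bayes-invert the transitions to get
    # P(s | s_next) for the rewarded successors only, weight by reward, and average.
    rewarded = np.flatnonzero(r > 0)
    joint = T[:, rewarded] * p[:, None]               # P(s, s_next) for rewarded successors
    pr = joint / joint.sum(axis=0, keepdims=True)     # P(s | s_next)
    v_pr = (pr * r[rewarded]).mean(axis=1)

    print("forward values:", v_sr)
    print("backward values:", v_pr)

Note that the backward valuation only consults the columns of the map that feed rewarded states, which is what makes PR cheaper when rewards are concentrated on few successors.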
Fig. 2. Humans favor PR over SR in a divergent environment.
(A) Task Design. Divergent state space. The study followed the same three-phase structure – learning, prediction, and map-change test – as in Study 1. (B) PR Computation. In this study, prediction queries were designed such that SR and PR each favored a different starting state. For this example, participants were instructed they would be rewarded 100 points if they reached the snorkel state. To reach this state, PR favors trident because of trident’s higher base rate. The efficiency of PR here is demonstrated by the relevant PR containing only 3 probabilities. (C) SR Computation. For the same query described in panel B, the SR-based computation favors planet, since planet is more likely to lead to snorkel. Determining this, however, requires consulting SRs that are composed of a total of 23 probabilities. (D) Participants’ choices relied on backward prediction. The top panel shows the distribution of participants by how frequently their choices were consistent with backward prediction (>0.5) as opposed to forward prediction (<0.5). 0.5 (black line) corresponds to a participant whose choices were equally consistent with both forms of prediction. The bottom panel shows the posterior distribution of the population mode. In both panels, blue-shaded regions denote evidence of backward prediction, and red-shaded regions denote evidence of forward prediction. (E) Bias check. Participants did not display a substantial bias in favor of low- or high-base-rate starting states. 0.5 on the x-axis corresponds to no bias, signified by the vertical black bar. (F) Map-Change Test. Designed as in Study 1 (Fig. 1E). The drop in performance after the instructed change to the map validates participants’ use of PR for backward prediction. (G) Model Comparison. We modeled all participants’ choices as a function of their expected value predictions, and found that PR had the greatest model evidence. MB – model-based model. BR – base rate bias model. Null model guesses which option is best (i.e., 0.5 probability of choosing either option).
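The dissociation described in panels B and C can be reproduced with a toy computation (the probabilities below are assumptions for illustration, not the task's actual parameters): the forward comparison picks the start that leads to the goal most often, while the Bayes-inverted backward comparison can pick the other start once base rates are taken into account.

    import numpy as np

    # Assumed numbers for a Fig. 2-style query: trident is the high-base-rate start,
    # planet the low-base-rate one, and the snorkel state is the rewarded goal.
    p = np.array([0.8, 0.2])                    # base rates: [trident, planet]
    p_goal_given_start = np.array([0.2, 0.5])   # P(snorkel | start)

    # Forward (SR-style) comparison: planet wins, since it reaches the goal more often.
    sr_choice = np.argmax(p_goal_given_start)            # -> 1 (planet)

    # Backward (PR-style) comparison: invert to P(start | snorkel); trident wins,
    # because its higher base rate outweighs its weaker transition to the goal.
    joint = p_goal_given_start * p                        # P(start, snorkel)
    p_start_given_goal = joint / joint.sum()
    pr_choice = np.argmax(p_start_given_goal)             # -> 0 (trident)

    print(sr_choice, pr_choice)  # 1 0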
Fig. 3. Humans favor SR over PR in a convergent environment.
(A) Task Design. Convergent state space used in Study 3. The study followed the same three-phase structure – learning, prediction, and map-change test – as in Study 1 (see Fig. 1), with prediction queries now offering a choice between 2 out of 9 possible starting states. (B) PR Computation. Queries were designed such that SR and PR each favored a different starting state. In this example query, participants were instructed they would be rewarded 100 points if they reached the planet state. To reach this state, PR favors compass because of compass’s higher base rate relative to the other state on offer, the train. (C) SR Computation. For the same query described in panel B, SR prefers the lower-base-rate train starting state because the rewarded outcome follows the train more often than the high-base-rate compass. The relevant SR that needs to be consulted only includes 6 probabilities, whereas PR includes 11 probabilities, making SR more efficient. (D) Participants’ choices relied on forward prediction. (Top) The distribution of participants by how frequently their choices were consistent with backward prediction (>0.5) as opposed to forward prediction (<0.5). 0.5 (black line) corresponds to a participant whose choices were equally consistent with both forms of prediction. (Bottom) The posterior distribution of the population mode (yellow bar: 95% highest density interval). In both panels, blue-shaded regions denote evidence of backward prediction, and red-shaded regions denote evidence of forward prediction. (E) Bias check. Participants did not display a substantial bias in favor of low- or high-base-rate starting states. 0.5 on the x-axis corresponds to no bias, signified by the black vertical bar. (F) Map-Change Test. The drop in performance after the instructed change to the map validates participants’ use of SR for forward prediction. (G) Model Comparison. We modeled all participants’ choices as a function of their expected value predictions, and found that SR had the greatest model evidence. MB – model-based model. BR – base rate bias model. Null model guesses which option is best (i.e., 0.5 probability of choosing either option).
Fig. 4. Humans favor the more efficient representation in multistep tasks.
(A) Task Design. The panel displays the multistep divergent (left) and convergent (right) state spaces used in Studies 4 and 5. These studies used the same phases as all previous experimental studies. Here, however, two actions were required to reach a goal – the first to select a starting state, which probabilistically led to an intermediate state, and the second to choose left (‘L’) or right (‘R’), denoted by two different letters on a keyboard, to deterministically reach an end state. During the decision phase, participants planned both actions without seeing the outcomes of either. (B) Experimental results. The top panels show the distribution of participants by how frequently their choices were consistent with backward prediction (>0.5) as opposed to forward prediction (<0.5). 0.5 (black line) corresponds to a participant whose choices were equally consistent with both forms of prediction. The bottom panels show the posterior distributions of the population mode, as inferred from participants’ choices using hierarchical Bayesian modeling (yellow bar: 95% highest density interval). In all panels, blue-shaded regions denote evidence of backward prediction, and red-shaded regions denote evidence of forward prediction. On the left, panels show greater evidence of backward prediction in divergent tasks, and on the right, panels show greater evidence of forward prediction in convergent tasks. (C) Map-Change Test. The accuracy of participants’ predictions on a set of single-goal queries was compared before (learnt map) and after (changed map) an instructed change to the map that rendered predictive strategies, such as PR and SR, unhelpful. The drop in performance validates participants’ use of PR to plan in the divergent task, and participants’ use of SR to plan in the convergent task. (D) Model Comparison. We modeled all participants’ choices as a function of each strategy’s expected value predictions, and found that PR had the greatest model evidence in the divergent task, and SR in the convergent task. MB – model-based model. BR – base rate bias model. Null model guesses which option is best (i.e., 0.5 probability of choosing either option).
