Sci Rep. 2023 Apr 20;13(1):6486.
doi: 10.1038/s41598-023-33008-2.

Decision heuristics in contexts integrating action selection and execution

Neil M Dundon et al. Sci Rep. 2023.

Abstract

Heuristics can inform human decision making in complex environments through a reduction of computational requirements (accuracy-resource trade-off) and a robustness to overparameterisation (less-is-more). However, tasks capturing the efficiency of heuristics typically ignore action proficiency in determining rewards. The requisite movement parameterisation in sensorimotor control questions whether heuristics preserve efficiency when actions are nontrivial. We developed a novel action selection-execution task requiring joint optimisation of action selection and spatio-temporal skillful execution. State-appropriate choices could be determined by a simple spatial heuristic, or by more complex planning. Computational models of action selection parsimoniously distinguished human participants who adopted the heuristic from those using a more complex planning strategy. Broader comparative analyses then revealed that participants using the heuristic showed combined decisional (selection) and skill (execution) advantages, consistent with a less-is-more framework. In addition, the skill advantage of the heuristic group was predominantly in the core spatial features that also shaped their decision policy, evidence that the dimensions of information guiding action selection might be yoked to salient features in skill learning.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Action selection-execution task (boatdock) outline. The task tests whether participants perform in a goal-oriented manner by situationally accepting higher motor execution costs to achieve higher reward. a On each trial, participants first select one of two cursors and then pilot that cursor from a start to a goal (start-goal pairing; SG). b Each cursor can accelerate in three unique directions, making some cursors more suitable to some SGs due to reduced direction changes. c Position of the index (I), middle (M) and ring (R) fingers of the right hand on the throttle buttons throughout the experiment, and cursor-specific throttle-vector mapping. One cursor imparts a higher motor execution cost through a mapping that is incongruent with finger position (in this case the blue cursor, but cursor colour is counterbalanced across subjects). Fuel burns any time a throttle is pushed down. Each trial allows six cumulative seconds of throttling before fuel depletes. d Throttle time burns fuel linearly but increases displacement nonlinearly. Faster displacement is therefore more fuel efficient; however, a maximum dock displacement imparts additional temporal control requirements. e Successful docks yield a reward contingent on fuel conservation. This requires jointly maximising cursor choice for a given SG (action selection) and spatial and temporal skill (action execution). Trials containing catastrophic errors (running out of fuel, leaving the grid, or docking above maximum displacement) yield no reward. f Schematic of two similar SGs with the same cursor but different performance dynamics. Three horizontal lines in each panel chart activity over time separately for each vector, while each vortex relates to a single throttle pulse. The top panel uses fewer direction changes (marked with c1,…,cn), reaches a higher maximum displacement (depicted by the diameter of the largest vortex) and yields higher reward (depicted by colour). g Reward (depicted by colour), yielded on every successful trial across all participants (individual markers), is a joint function of spatial and temporal skill.
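The reward rule in this caption can be sketched in Python. This is an illustration only: it assumes that reward equals the proportion of fuel preserved (the definition given in the Figure 3 caption) and that the six seconds of throttle time map linearly onto the fuel budget. The function name, argument names and example values are hypothetical and are not taken from the task code.

```python
FUEL_BUDGET_S = 6.0  # cumulative throttle seconds allowed per trial (from the caption)


def trial_reward(throttle_s, left_grid, dock_displacement, max_dock_displacement):
    """Return the assumed reward for one trial, in the range [0, 1].

    throttle_s: total seconds any throttle was held down (fuel burns linearly).
    left_grid: True if the cursor left the grid.
    dock_displacement: displacement at the moment of docking.
    max_dock_displacement: hypothetical threshold above which a dock fails.
    """
    out_of_fuel = throttle_s >= FUEL_BUDGET_S
    docked_too_fast = dock_displacement > max_dock_displacement
    # Any catastrophic error yields no reward.
    if out_of_fuel or left_grid or docked_too_fast:
        return 0.0
    # Otherwise reward is the proportion of fuel preserved.
    return 1.0 - throttle_s / FUEL_BUDGET_S
```

Under these assumptions, a successful dock after 1.5 s of throttling preserves three quarters of the fuel, while the same trial with any catastrophic error scores zero.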
Figure 2
Action-selection strategy identified by DDM framework. a–b A heuristic selects the cursor with a displacement vector with the least angular offset to the start-goal (SG) vector. Complex planning (route-planning) selects the cursor based on cursor-specific reward projections, i.e., incorporating additional spatial and/or temporal parameters into cursor evaluation over the heuristic. c Strategy-specific cursor suitability is imperfectly correlated across all trials from all participants. Hotter colours describe greater Euclidean distance between S and G. d Reaction time (RT) for action selection is greater on SGs where strategies ascribe equivalent suitability to both cursors. Each marker is the mean of seven RT bins after sorting all participants' trials by relevant strategy values. Error bars depict the standard error of the mean (S.E.M.) in each RT bin. e DDM framework. A noisy evidence accumulation process terminates at a decision criterion (boundary). We hypothesised that difficulty arising in our task would modulate the rate of evidence accumulation (drift rate μ). Depending on which strategy (heuristic or complex planning) a participant was using to make action selections, we expected greater modulation of their drift rate by difficulty arising from that strategy. For each participant's choice and RT data, two target models (one per strategy: heuristic and route-planning) allowed separate drift rates μ1 and μ2 for high and low difficulty, respectively. Partitioning feature spaces non-uniformly gives equal trial counts in each bin for each model (~72 per participant per bin) (right panel). This panel also depicts how different drift rates map onto different regions of each strategy's cursor-suitability scores, i.e., scores reflecting greater cursor similarity (high difficulty) are assigned μ1, while scores reflecting more obvious choices (low difficulty) are assigned μ2.
f Example trials for each cursor and difficulty where difficulty agreed between the heuristic and route-planning strategies. g Example trials for each cursor and difficulty where difficulty disagreed between the heuristic and route-planning strategies. In panels f and g, difficulty is normalized for cross-strategy comparison (lower bars); more rightward values reflect fewer trials (%) above that difficulty. h–i Three groups of participants emerged from modeling, based on whether their drift rate was modulated by heuristic (n = 14) or route-planning difficulty (n = 19), or whether they were best fitted by a null model (n = 20). A scaled schematic of the DDM profile estimated for the heuristic and route-planning (route) groups. j Comparison of DDM parameters between heuristic and route groups, consistent with the latter integrating additional information into decision formation. Groups differed in the sensitivity metric S = (μ1 + μ2)/(2B(1 + |bc|)), primarily due to the route group having a credibly higher boundary (B). The route group's bias (bc) was credibly above 0, indicating a bias away from the high-cost cursor. All parameters are expressed in arbitrary units, except t0 (in seconds). t0 and S parameters are aligned with the right axis. Boxes and thin lines respectively represent the interquartile range (IQR) and highest density interval (HDI) of group-specific posteriors. (*0 ∉ HDI(x_heuristic − x_route)). k Group classifications verified independently of DDM parameters. The route group was uniquely slower to select the congruent cursor on SGs requiring a multiple-segment route, where complex-planning demands likely become disproportionately greater than heuristic demands. The effect was not present in incongruent trials; see Results. (*p < 0.01, Tukey-corrected). Vertical lines are S.E.M.
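Two ideas in this caption lend themselves to a short Python sketch: the angular-offset heuristic (panels a–b) and the drift diffusion process (panel e). The code below is a minimal illustration under stated assumptions, not the authors' model. It does not perform the hierarchical Bayesian fitting, the simulator omits the bias term, and all function names, cursor vectors, noise settings and default values are hypothetical. The `sensitivity` function assumes the caption's metric has the form S = (μ1 + μ2)/(2B(1 + |bc|)).

```python
import math
import random


def angular_offset(vector, sg):
    """Unsigned angle in radians between a throttle vector and the SG vector."""
    dot = vector[0] * sg[0] + vector[1] * sg[1]
    cross = vector[0] * sg[1] - vector[1] * sg[0]
    return abs(math.atan2(cross, dot))


def heuristic_choice(cursors, start, goal):
    """Pick the cursor whose best-aligned throttle vector is closest to SG."""
    sg = (goal[0] - start[0], goal[1] - start[1])
    best = {name: min(angular_offset(v, sg) for v in vectors)
            for name, vectors in cursors.items()}
    return min(best, key=best.get)


def simulate_ddm(drift, boundary, t0=0.3, dt=0.002, noise=1.0, max_t=20.0, rng=None):
    """Simulate one unbiased trial; returns (choice, rt).

    Evidence starts at 0 and accumulates with the given drift plus Gaussian
    noise until it crosses +boundary or -boundary. choice is +1 or -1, or 0
    if neither boundary is reached before max_t. rt includes non-decision
    time t0.
    """
    rng = rng or random.Random()
    x, t = 0.0, 0.0
    while abs(x) < boundary and t < max_t:
        x += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    if abs(x) < boundary:
        return 0, t0 + t
    return (1 if x > 0 else -1), t0 + t


def sensitivity(mu1, mu2, boundary, bias):
    """Assumed form of the caption's metric: (mu1 + mu2) / (2B(1 + |bc|))."""
    return (mu1 + mu2) / (2.0 * boundary * (1.0 + abs(bias)))
```

With hypothetical cursors `{"blue": [(1, 0), (0, 1), (-1, -1)], "red": [(1, 1), (-1, 0), (0, -1)]}` and a goal diagonally up and to the right of the start, `heuristic_choice` picks "red", because one of its vectors points straight along SG. In the simulator, a lower drift rate (the caption's high-difficulty μ1) produces slower responses on average than a higher one, which is the RT pattern the model exploits.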
Figure 3
Heuristic group reaches state-appropriate choice more quickly and shows a spatial-specific skill advantage. a Consistent with classic decision-heuristic models, low dimensional planning aligns with faster trajectories toward state-relevant (appropriate) choice. Hierarchical binomial model of choice behaviour demonstrates a trade-off between the expediency and profundity of policy formation; the heuristic group exceeded chance by run 2, earlier than the route group (run 4). † marks runs where the HDI of the group-level θ posterior did not include 0.50, i.e., where the group-level proportion of choices was credibly above chance. b–d Skill and skill learning suggest the dimensions of information guiding action selection are yoked to salient features in skill learning. Collapsing group-level posterior means across runs (skill), the heuristic group yielded more reward with the high-cost cursor (b, histograms bottom panel), driven by superior spatial skill, i.e., the likely dominant feature in their action-selection policy (c, histograms bottom panel), with no route-heuristic difference in temporal skill (d, histograms). Asterisk relates to a credible difference between route and heuristic groups, i.e., that the HDI of the deterministic distribution of their difference (heuristic − route) does not contain 0. Additionally, while the route and heuristic groups both demonstrated skill learning in terms of reward and spatial skill (b–c, line plots), the route group uniquely demonstrated learning in the temporal domain, a likely feature in their action-selection policy (d, line plots). Boxes and thin lines in line plots respectively represent IQR and HDI of hierarchical posteriors constraining individual-participant posteriors for a given measure, run and cursor. In both histograms and line plots, reward is the proportion of fuel preserved per trial (higher better), spatial is the number of direction changes (fewer better) and temporal is the distance-normalized difference between max and final velocity (higher better). Time-on-task (skill learning) effects estimated from deterministic regression models fitted across draws from each run’s posterior; credible (0 ∉ coefficient HDI) effects depicted by either a dashed (logarithmic) or solid (linear) line. Absence of any line reflects a noncredible time-on-task effect.
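The spatial and temporal measures defined in this caption, and the HDI-based credibility check it relies on, can be sketched in Python as follows. The sketch rests on several assumptions: throttle pulses are encoded as a sequence of vector labels, velocity is sampled as a list of speeds, and the HDI covers 95% of posterior samples. The caption does not state the interval mass, the data encoding, or which distance is used for normalisation, and all function names are hypothetical.

```python
import math


def direction_changes(pulses):
    """Spatial measure: count switches between consecutive throttle vectors."""
    return sum(1 for prev, cur in zip(pulses, pulses[1:]) if prev != cur)


def temporal_skill(speeds, distance):
    """Temporal measure: (max speed - final speed), normalised by distance."""
    return (max(speeds) - speeds[-1]) / distance


def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the posterior samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def is_credible(samples, reference=0.0, mass=0.95):
    """True when the HDI excludes the reference (e.g. 0, or 0.50 for chance)."""
    low, high = hdi(samples, mass)
    return not (low <= reference <= high)
```

For example, the pulse sequence I, I, M, R, R, M contains three direction changes, and a group difference would be flagged as credible only when `is_credible` returns True for samples of the heuristic minus route difference.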
