Randomized Controlled Trial
J Neurosci. 2020 Jul 1;40(27):5273-5282.
doi: 10.1523/JNEUROSCI.2586-19.2020. Epub 2020 May 26.

Dopamine Modulates Dynamic Decision-Making during Foraging

Campbell Le Heron et al. J Neurosci. 2020.

Abstract

The mesolimbic dopaminergic system exerts a crucial influence on incentive processing. However, the contribution of dopamine in dynamic, ecological situations where reward rates vary, and decisions evolve over time, remains unclear. In such circumstances, current (foreground) reward accrual needs to be compared continuously with potential rewards that could be obtained by traveling elsewhere (background reward rate), to determine the opportunity cost of staying versus leaving. We hypothesized that dopamine specifically modulates the influence of background, but not foreground, reward information when making a dynamic comparison of these variables for optimal behavior. On a novel foraging task based on an ecological account of animal behavior (marginal value theorem), human participants of either sex decided when to leave locations in situations where foreground rewards depleted at different rates, either in rich or poor environments with high or low background reward rates. In line with theoretical accounts, people's decisions to move from current locations were independently modulated by changes in both foreground and background reward rates. Pharmacological manipulation of dopamine D2 receptor activity using the agonist cabergoline significantly affected decisions to move on, specifically modulating the effect of background reward rates. In particular, when on cabergoline, people left patches in poor environments much earlier. These results demonstrate a role of dopamine in signaling the opportunity cost of rewards, not value per se. Using this ecologically derived framework, we uncover a specific mechanism by which D2 dopamine receptor activity modulates decision-making when foreground and background reward rates are dynamically compared.

SIGNIFICANCE STATEMENT: Many decisions, across economic, political, and social spheres, involve choices to "leave". Such decisions depend on a continuous comparison of a current location's value, with that of other locations you could move on to. However, how the brain makes such decisions is poorly understood. Here, we developed a computerized task, based around theories of how animals make decisions to move on when foraging for food. Healthy human participants had to decide when to leave collecting financial rewards in a location, and travel to collect rewards elsewhere. Using a pharmacological manipulation, we show that the activity of dopamine in the brain modulates decisions to move on, with people valuing other locations differently depending on their dopaminergic state.

Keywords: decision making; dopamine; foraging; opportunity cost; reward.

Figures

Figure 1.
Patch-leaving paradigm. A, Participants had to decide how long to remain in their current patch (field), in which reward (milk) was returned at an exponentially decreasing rate (displayed on the screen by continuous filling [white bar] of the silver bucket), before moving on to the next patch, which incurred a fixed cost of 6 s during which they could collect no reward. Their goal was to maximize milk return across the whole experiment. The instantaneous rate of bucket filling indicated the foreground reward rate, whereas the colored frame indicated the distribution of different patch types and their average value, and thus the background reward rate. Participants were aware they had ∼10 min in each environment (which were blocked), but were not shown any cues to indicate how much total time had elapsed. Following a leave decision, a clock ticking down the 6 s travel time was presented. B, Three foreground patch types were used, differing in the scale of filling of the milk bucket (low, medium, and high yield), which determined the foreground reward rate. Two different background environments (farms) were used, with the background reward rate determined by the relative proportions of these patch types. The rich environment (gold farm) contained a higher proportion of high-yield fields, and a lower proportion of low-yield ones, meaning it had a higher background reward rate than the poor environment (green farm), which had a higher proportion of low-yield fields. C, According to the marginal value theorem (MVT), participants should leave each patch when the instantaneous reward rate in that patch (gray lines) drops to the background environmental average (gold and green dotted lines). Therefore, people should leave sooner from all patches in rich (gold dotted line) compared with poor (green dotted line) environments, but later in high-yield compared with low-yield patches. Crucially, these two effects are independent of each other.
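The leaving rule in panel C can be illustrated with a minimal sketch. It assumes the patch reward rate decays as r(t) = r0 * exp(-k * t) and treats the background reward rate as a given constant; all numeric values and names below are hypothetical, chosen only for illustration, and are not taken from the study.

```python
import math


def mvt_leave_time(initial_rate, decay, background_rate):
    """Time at which initial_rate * exp(-decay * t) falls to background_rate."""
    if background_rate >= initial_rate:
        return 0.0  # the patch never beats the environment: leave immediately
    return math.log(initial_rate / background_rate) / decay


# Hypothetical values for illustration only (not the task's real parameters).
DECAY = 0.1  # exponential decay constant, per second
PATCH_INITIAL_RATES = {"low": 30.0, "medium": 45.0, "high": 60.0}
BACKGROUND_RATES = {"rich": 20.0, "poor": 10.0}

# Predicted leaving time for every environment / patch-type combination.
leave_times = {
    (env, patch): mvt_leave_time(r0, DECAY, bg)
    for env, bg in BACKGROUND_RATES.items()
    for patch, r0 in PATCH_INITIAL_RATES.items()
}

for (env, patch), t in leave_times.items():
    print(f"{env:>4} environment, {patch:>6}-yield patch: leave after {t:.1f} s")
```

Under these assumptions the leaving time is (ln r0 - ln B) / k, where B is the background rate, so the patch-type and environment effects add without interacting: the poor-minus-rich difference is ln(2) / 0.1, roughly 6.9 s, for every patch type.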
Figure 2.
Healthy human foragers are guided by MVT principles. A, Raw patch-leaving times. Participants (N = 39) left patches later when the background environment was poor, compared with rich (p < 0.00001), and when patches had higher, compared with lower yields (p < 0.00001), with no interaction between patch type and background environment (p = 0.2). B, These effects of changing reward parameters were in the predicted direction, with participants leaving on average 4.7 s later as patch type varied, and 3.6 s later in poor compared with rich environments. There was more variation between individuals in the effects of changing background, compared with foreground, reward rates. Dashed lines indicate predicted (MVT) effects of changing reward rate on leaving time. C–E, Participants showed a bias to remain in patches longer than predicted by MVT. Mean leaving time for each environment, collapsed across patch type, is shown in C, whereas D and E demonstrate mean leaving times for each patch type in the rich and poor environments, respectively. F, The foreground (patch) reward rate at which participants chose to leave each patch varied as a function of background environmental richness (rich vs poor). G, The magnitude of this background environment effect was close to optimal (as predicted by MVT). Error bars indicate ± SEM.
Figure 4.
Patch-leaving times: observed, and predicted based on MVT. A–C, Although predicted leaving times based on actual long-run background reward rate were later than optimally predicted by MVT, actual leaving times were still significantly later. Dots represent MVT predicted optimal leaving time. Dotted lines indicate predictions based on actual long-run background reward rate. Lines indicate actual behavior. Green represents poor environment. Gold represents rich environment.
Figure 3.
Cabergoline alters use of background reward information to guide patch leaving. A, Mean patch-leaving times for each patch type, split by environment and drug state. B, There was a significant interaction between drug and background (environment) reward rate on leaving time, with a reduced effect of background environment ON cabergoline compared with OFF (p = 0.023). In contrast, there was no significant interaction between drug and the effect of changing foreground (patch type) reward on patch leaving (p = 0.26). Black dotted lines indicate the predicted magnitude of effect of changing patch type and background environment, based on the MVT. C, Instantaneous patch reward rate at time of leaving, collapsed across patch types. Participants showed a significant bias to leave all patch types later than optimally predicted. The effect of cabergoline was mainly driven by participants leaving patches in the poor environment earlier ON drug and, therefore, when the current patch reward rate was higher (inset). D, Alternative representation of data, plotting instantaneous reward rate at time of leaving each patch type in each environment, ON and OFF cabergoline. E–H, Relationship between patch-leaving time and patch reward rate for each condition, ON and OFF cabergoline. N = 29, comparisons are within-subject. Error bars indicate ± SEM.

