Neuron. 2025 Oct 15;113(20):3458-3475.e12.

doi: 10.1016/j.neuron.2025.07.008. Epub 2025 Aug 7.

Competitive integration of time and reward explains value-sensitive foraging decisions and frontal cortex ramping dynamics

Affiliations

¹ Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA; Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; Sainsbury Wellcome Centre, University College London, London W1T 4JG, UK. Electronic address: m.bukwich@ucl.ac.uk.
² Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA; Center for Brain Science, Harvard University, Cambridge, MA 02138, USA. Electronic address: mgcampb@fas.harvard.edu.
³ Department of Statistics, Stanford University, Stanford, CA 94305, USA; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305, USA.
⁴ Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA; Center for Brain Science, Harvard University, Cambridge, MA 02138, USA.
⁵ Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA; Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; Department of Psychology, Harvard University, Cambridge, MA 02138, USA.
⁶ Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA; Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon 16419, Republic of Korea; Department of Biomedical Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea.
⁷ Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA.
⁸ Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA; Center for Brain Science, Harvard University, Cambridge, MA 02138, USA. Electronic address: uchida@mcb.harvard.edu.

PMID: 40780211
PMCID: PMC12784418
DOI: 10.1016/j.neuron.2025.07.008

Michael Bukwich et al. Neuron. 2025.

Abstract

Patch foraging is a ubiquitous decision-making process in which animals decide when to abandon a resource patch of diminishing value to pursue an alternative. We developed a virtual foraging task in which mouse behavior varied systematically with patch value. Behavior could be explained by models integrating time and rewards antagonistically, scaled by a slowly varying latent patience state. Describing a mechanism rather than a normative prescription, these models quantitatively captured deviations from optimal foraging theory. Neuropixels recordings throughout frontal areas revealed distributed ramping signals, concentrated in the frontal cortex, from which multiple integrator models' decision variables could be decoded equally well. These signals reflected key aspects of decision models: they ramped gradually, responded oppositely to time and rewards, were sensitive to patch richness, and retained memory of reward history. Together, these results identify integration via frontal cortex ramping dynamics as a candidate mechanism for solving patch-foraging problems.

Keywords: Marginal Value Theorem; Neuropixels; decision making; foraging; frontal cortex; latent state; neural integration; patch foraging; ramping activity; virtual reality.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

**Figure 1. Mice foraging times qualitatively match MVT.**
A) Schematic of virtual patch-foraging task. B) Combinations of reward sizes and frequencies yield nine patch types (Fig. S1A). C) Probability of reward after one-second intervals per frequency condition. D) Example trials from two mice. Mice stop in response to proximity cues to enter patches and receive stochastic water rewards. Brightness was increased for task state photos. E) Reward deliveries (circles) and patch-leave times (triangles) from example session grouped by patch reward size. Trials are sorted in ascending order of PRT from top to bottom. Circles’ coloring indicates underlying frequency condition. F) PRT per patch type from an example mouse. Column groupings separate patches by reward size. Within columns, patches are split by reward frequency (p<0.0001 for size and frequency, GLM). Box: median ± IQR; whiskers = 5-95%; points = outliers. G) A generalized linear mixed-effects model (GLME) with log link function predicts PRTs across conditions. Points indicate mouse PRTs with error bars showing SEM. Lines represent GLME fits. Colored by mouse ID ‘i’. H) Coefficients from linear regression of normalized PRT on current trial reward size/frequency and average reward size/frequency of the preceding 5 trials. Points show individual mice, and black error bars show mean ± SEM over mice (n=22). Coefficients on current reward size/frequency were positive, whereas coefficients on previous reward size were negative (current reward size coefficient = 0.18 ± 0.02, t-test versus zero p=4.2×10⁻¹⁰; current reward frequency coefficient = 0.084 ± 0.0084, p=2.2×10⁻⁹; previous reward size coefficient = −0.029 ± 0.0087, p=0.0033; n=22 mice). Coefficient on previous reward frequency (a noisier indicator of patch quality) did not differ significantly from zero (previous reward frequency coefficient = 0.0032 ± 0.008, p=0.69; n=22 mice). ns, not significant, ** p<0.01, **** p<0.0001, t-test versus zero.

**Figure 2. Scaling a common function explains PRT variance.**
A) Left: Mean PRT per μL across sessions, example mouse. Right: Standard deviation of mean PRT across sessions per subject, colored by mouse ID. B) PRT autocorrelation over patches, mean across subjects. Error bars indicate standard deviation across subjects. C) Estimating latent “patience” state from surrounding PRTs. Left: Example session. Gray trace tracks PRT with colored points indicating reward size for each patch. Black trace indicates estimated latent state (Gaussian filtered PRT, omitting current trial). Middle: Estimated latent state for all sessions from an example mouse. Right: Mean normalized latent estimates across sessions per mouse. Black line shows mean across mice. D) Coefficient of variation (CV) of latent state within versus across sessions per mouse. For within session measures, CV of latents across patches was calculated per session, and mean CV is shown. For across session measures, mean latent was calculated per session, and CV over these means is shown. For the across mice measure, mean latent was calculated per mouse, and CV over these means is shown (red line). E) Generalized linear model (GLM) scaled by latent patience estimates. F) R² statistics for predicting PRT using: GLME (from Fig. 1G), mean latent per session, per trial latent, and latent-scaled GLM. Ordered by ascending mean R², stars indicate significance of pairwise Wilcoxon signed-rank tests (p=0.0044, 0.0027, 0.002, Bonferroni-adjusted).
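The latent "patience" estimate in panel C (Gaussian-filtered PRT, omitting the current trial) could be computed along the following lines. This is a minimal sketch, not the authors' implementation: the function name `latent_patience` and the kernel width `sigma` (in units of patches) are hypothetical, since the legend does not state the filter width.

```python
import math


def latent_patience(prts, sigma=3.0):
    """Leave-one-out Gaussian-weighted mean of neighbouring PRTs.

    prts  : list of patch residence times, in session order
    sigma : assumed kernel width in patches (not given in the legend)
    """
    estimates = []
    for i in range(len(prts)):
        weighted_sum = 0.0
        weight_total = 0.0
        for j, prt in enumerate(prts):
            if j == i:
                continue  # omit the current trial from its own estimate
            w = math.exp(-0.5 * ((j - i) / sigma) ** 2)
            weighted_sum += w * prt
            weight_total += w
        # A single-patch session has no neighbours to average over.
        estimates.append(weighted_sum / weight_total if weight_total > 0 else float("nan"))
    return estimates


session_prts = [4.0, 5.5, 6.0, 12.0, 7.0, 6.5]
print(latent_patience(session_prts))
```

Because the current trial is excluded, an unusually long or short PRT does not contribute to the latent estimate used to explain that same trial.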

**Figure 3. Competitive integration processes explain foraging behavior.**
A) Instantaneous expected reward rate at time of patch leave across patch types, predicted by idealized MVT (Left), example mouse (Middle, Bonferroni-adjusted p<0.0001 for size, p>0.99 for frequency, 2-way ANOVA), and population (Right, p<0.0001 for size and frequency, linear mixed-effects model). Error bars for example mouse indicate SEM. Error bars for population average indicate standard deviation across mice. B) Schematic demonstrating MVT predictions for PRT per reward size for two different thresholds. Traces are colored by reward size. Dashed lines show two sample thresholds, a and b. Black dots indicate threshold crossing points. Gray box notes examples for **Prediction 2** (Top) and **Prediction 3** (Bottom). C) Mean PRT differences between ($4\,\mu\mathrm{L} - 2\,\mu\mathrm{L}$) versus ($2\,\mu\mathrm{L} - 1\,\mu\mathrm{L}$) patches. Dashed line indicates unity. D) Sigmoid transformation of decision variable (DV) into leave probability per one-second interval ($P(\mathrm{Leave})/\mathrm{s}$), scaled by different inverse temperature values ($\Psi$, shades of gray). Red dashed line indicates $P_{\max}$, the maximum $P(\mathrm{Leave})/\mathrm{s}$ output. E) Example traces of DV (Top) and corresponding $P(\mathrm{Leave})/\mathrm{s}$ output (Bottom) for integrator models over an example patch with rewards at $t = [0, 1, 4, 5]$ seconds. Red dashed line indicates $P_{\max}$. F) Schematic demonstrating DV and $P(\mathrm{Leave})/\mathrm{s}$ scaling by latent state, for an example patch with rewards at $t = [0, 4]$ seconds. Black/red traces indicate patches with higher/lower latent state estimates and ramp up less/more quickly. G) Relative BIC values for model fits across subjects. H) Schematics demonstrating differing Models 2 (Left) and 3 (Right) predictions on patches with rewards at $t = [0, 2]$ sec (‘R0R’ patches, black) versus $t = [0, 1, 2]$ sec (‘RRR’ patches, blue). I) Per-subject mean simulated PRT for ‘R0R’ versus ‘RRR’ patches from Model 2 (Left), Model 3 (Middle), and empirical mouse PRT (Right). Model 3 and mouse PRTs were higher for ‘RRR’ versus ‘R0R’ trials (p<0.0001, Model 3; p=0.0024, Mice; Wilcoxon signed-rank test). There was a small but significant effect of greater PRTs for ‘R0R’ versus ‘RRR’ trials for Model 2 (p=0.0037). This was due to selection bias over latent state (‘R0R’ trials tend to have higher latent state than ‘RRR’ trials, lengthening PRT) and was in the opposite direction of the empirical mouse data. Points are colored per mouse. Dashed line indicates unity. J) Left: Mean PRT across patch types, per subject, from Model 3 simulations versus empirical mouse PRT, log-scaled, colored per mouse ($R^{2} = 0.985$, MSE = 0.413 s). Right: Mean instantaneous expected reward rate at time of patch leave from Model 3 simulations (as in ‘A’). K) Schematic demonstrating that integrator models can account for deviations from MVT predictions. Example traces are shown when patience is lower (solid lines) versus higher (dashed lines) for each reward size (color). Patch-leave times are determined by integrator value reaching threshold (black dashed line, points indicate threshold crossings). Gray box notes example violations of **Prediction 2** (Top) and **Prediction 3** (Bottom). L) Mean Model 3-simulated PRT differences between ($4\,\mu\mathrm{L} - 2\,\mu\mathrm{L}$) versus ($2\,\mu\mathrm{L} - 1\,\mu\mathrm{L}$) patches. Dashed line indicates unity. M) Schematic depicting how model-predicted PRT is calculated on a single trial (Methods). N) Left: $R^{2}$ statistics for single-trial predictions from cross-validated Model 3 fits across mice (median $R^{2} = 0.544$). Box: median ± IQR; whiskers = 5-95%. Right: Single-trial Model 3-predicted PRT (cross-validated) versus true PRT for an example mouse ($R^{2} = 0.801$). Colors indicate reward size per patch. O) Model 3-predicted PRT and empirical mouse PRT across patches from example session in Fig. 2C. Colored points indicate mouse PRT per reward size. Black points indicate Model 3 cross-validated predicted PRT. Lines connecting points highlight the difference in predicted versus empirical PRT. Gray trace indicates latent state estimate for each patch.
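The antagonistic integration of time and reward (panels D-F) can be illustrated with a toy simulation. This is a sketch under stated assumptions and does not reproduce any of the paper's Models 1-3: the update rule, the parameter names (`ramp`, `reward_kick`, `dv0`, `latent`), their values, and the exact sigmoid form `p_max / (1 + exp(-psi * dv))` are all hypothetical. Only the qualitative features come from the legend: the DV ramps with time, rewards push it the opposite way, a higher latent state slows the ramp, and a sigmoid capped at $P_{\max}$ maps the DV to a per-second leave probability.

```python
import math


def leave_probability(dv, psi=1.0, p_max=0.5):
    """Assumed sigmoid: per-second leave probability, capped at p_max."""
    return p_max / (1.0 + math.exp(-psi * dv))


def integrate_dv(reward_times, n_seconds, latent=1.0,
                 ramp=1.0, reward_kick=2.0, dv0=-3.0):
    """Toy competitive integrator evaluated once per second.

    reward_times : set of integer seconds at which reward is delivered
    latent       : hypothetical patience scale; larger values slow the ramp
    """
    dv = dv0
    trace = []
    for t in range(n_seconds):
        if t in reward_times:
            dv -= reward_kick      # reward pushes the DV away from leaving
        trace.append(dv)
        dv += ramp / latent        # elapsed time pushes the DV toward leaving
    return trace


# Example patch from panel E: rewards at t = 0, 1, 4, 5 seconds.
trace = integrate_dv({0, 1, 4, 5}, n_seconds=8)
for t, dv in enumerate(trace):
    print(t, dv, round(leave_probability(dv), 4))
```

In this toy version each reward lowers the DV and hence the leave probability, while a larger `latent` value makes the DV climb more slowly, which lengthens the expected residence time.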

**Figure 4. Reward-suppressed ramps are prevalent in frontal cortex.**
A) Example histology slice. B) Distribution of recorded brain areas. C) Example neuron with ramping activity suppressed by reward delivery. Left: PSTH aligned to patch stop, split by reward size (trials with reward at $t = 1$ omitted). Middle: PSTH aligned to patch leave, split by reward size. Right: PSTH aligned to patch stop, split by whether reward was delivered at 1 second (red) or not (black) (rewards of different size combined). D) Hand-picked principal components (PCs) of neural activity showing integrator-like activity. For simplicity, only 4 $\mu$L trials are shown. Magenta/black traces are trials with/without reward at $t = 1$. Lines and shaded area = mean ± SEM over trials. E) Total variance explained by PCs with significant ramping slopes (black) versus a shuffle control (gray). **** p<0.0001, data versus shuffle, signed-rank test (n=33 sessions). F) Histogram of correlation between single neuron firing rates and Model 3 DV (Fig. 3). Red/black indicate neurons with/without significant correlation versus shuffle control (z-test p<0.001). G) $R^{2}$ between individual neurons’ firing rates and the Model 3 DV, by brain region. Points represent recording sessions (left) or mice (right). Frontal cortex areas had higher mean $R^{2}$ values than subcortical areas (p<0.01, paired t-test, n=9 mice). **Frontal Cortex Areas:** OFC: Orbitofrontal cortex, ACC: Anterior cingulate cortex, PL: Prelimbic cortex, IL: Infralimbic cortex, M2: Secondary motor cortex, M1: Primary motor cortex. **Subcortical Areas:** DMS: Dorsomedial striatum, DP: Dorsal peduncular area, LS: Lateral septum, OLF: Olfactory areas, STR: Striatum, TTd: Taenia tecta dorsal part, VS: Ventral striatum.

**Figure 5. DV decoder output shares features of Models 2 and 3.**
A) Schematic of decision variable (DV) decoding. B) Decoder output (red) versus true Model 3 DV (black) for several contiguous patches from example recording sessions. C) CV $R^{2}$ for Model DVs (Models 1-3, Fig. 3), by Session (n=28, left), or Mouse (n=9, right). 5/33 sessions with CV $R^{2} < -0.1$ were excluded from further analysis. n.s. Not Significant (p>0.05, one-way ANOVA). D) Within-session correlations between DVs for pairs of models, averaged across sessions per mouse. E) Same as (D), but for neural predictions (decoder outputs). F) Same as (D), but for neural regression coefficients. G) Comparison of $R^{2}$ between decoders using only units from frontal cortex (Ctx) or subcortical areas (Sub). Each line represents a recording session, and colors represent mice. Black lines and error bars = mean ± SEM over sessions. ** p<0.01, *** p<0.001, paired t-test (n=23 sessions). H) PSTHs of DVs (left) and neural predictions (right; decoder output trained on those DVs) on trials with no reward at $t = 1$, which were used to estimate ramping slope. Lines and shaded areas = means ± SEMs over sessions. Colors indicate reward size. In panels H-P, results are shown for behavioral models 1-3, arranged from top to bottom. I) Estimated ramping slope for DVs (left) and neural predictions (right). Colored lines show individual mice (slopes averaged over sessions). Black lines and error bars show means ± SEM over mice. ** p<0.01, *** p<0.001, reward size coefficient, LME. J) Same as (H), but aligned to reward deliveries, excluding rewards at $t = 0$. K) Reward responses of DVs (left) and neural predictions (right). Significance level reflects the reward size coefficient in a linear model. For neural predictions, average response across reward sizes is also shown (“Avg”); stars indicate significance of a t-test versus zero (n=9 mice). N.S. Not Significant (p>0.05), * p<0.05, ** p<0.01, *** p<0.001. L) Slope from linear regression of reward response on pre-reward level of DVs (left) and neural predictions from decoders (right). For all three decoders, slope was approximately −0.5 for all reward sizes, indicating partial resetting. M) PSTHs of DVs on trial types isolating the effect of reward history: trials with rewards at $t = 0$ and $2$ seconds (‘R0R’) versus trials with rewards at $t = 0$, $1$, and $2$ seconds (‘RRR’). Lines and shaded areas indicate means ± SEMs over sessions. Trials are split by reward size and R0R vs RRR reward sequences, with lighter shades indicating R0R trials and darker shades indicating RRR trials. N) Same as (M) but showing neural predictions on R0R versus RRR trials. O) Comparison of the DV level just after the reward at $t = 2$ seconds between R0R (x-axis) and RRR (y-axis) trials for each reward size. P-values for TrialType (R0R vs. RRR) in an LME are shown for each behavioral model. By construction, only Model 3 has a significant TrialType coefficient, indicating sensitivity to reward history. P) Same as (O), but for neural predictions. All decoders had significant TrialType coefficients, indicating a reward history effect on decoder output most consistent with Model 3.
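The "partial resetting" reading of panel L follows from a short worked relation. The notation here is assumed for illustration and does not appear in the legend: $x$ is the pre-reward level of the DV or decoder output, $\Delta$ is the reward response, $a$ is the regression intercept, and $s$ is the regression slope.

```latex
\Delta = a + s\,x
\quad\Longrightarrow\quad
x_{\text{post}} = x + \Delta = a + (1 + s)\,x
% s =  0   : x_post = a + x      (purely additive response, no reset)
% s = -1   : x_post = a          (full reset, independent of pre-reward level)
% s = -0.5 : x_post = a + 0.5 x  (half of the pre-reward level is retained)
```

Under this reading, a slope near −0.5 means the post-reward level still carries about half of the pre-reward level, which lies between a purely additive response and a complete reset.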

**Figure 6. Functional clustering reveals module pairs with reciprocal integration dynamics.**
A) Schematic of analysis approach: Task variable coefficients estimated via Poisson GLM are used to cluster neurons using a Gaussian Mixture Model (GMM). B) GMM clustering to identify clusters of neural activity patterns. **Top left:** BIC was used to select the number of clusters (minimum BIC: 6 clusters). **Top right:** Percentage of neurons assigned to clusters. Clusters were ordered so patterns with similar shapes but opposite signs were adjacent (see panel E). **Bottom:** Task-related neurons projected into the PC space used for clustering. C) Neural activity for task-related neurons on “40” trials (4 $\mu$L reward at 0 seconds, no reward at 1 second; left panel) or “44” trials (4 $\mu$L reward at 0 and 1 second; right panel; white dashed line indicates reward at 1 second). GMM cluster identity for each neuron is indicated on the right. D) Average GLM-predicted reward responses and z-scored accumulator coefficients. Z-scored reward kernel coefficients were multiplied by corresponding basis functions and summed to generate the predicted reward response. In panels D-H, lines indicate means, and shaded regions indicate SEM. E) Average PSTHs of z-scored neural activity following patch stop for each cluster, split by reward size and whether or not reward was delivered at 1 second. F) Average PSTHs of z-scored neural activity aligned to patch leave for each cluster. G) Average PSTHs of z-scored neural activity for each cluster, aligned to patch stop and split by patch residence time. H) Same as (G) but aligned to patch leave. I) Average coefficient ($\beta$) per neuron in linear DV decoding from Models 1-3 (top to bottom; decoding as in Fig. 5), by GMM cluster. Coefficients were averaged over neurons within session first, then averaged over sessions. Lines indicate means and error bars indicate SEM over sessions (n=28 sessions). Coefficients differed by GMM cluster for all three models (LME with fixed effect of cluster and random effect per session; Model 1, p=0.0044; Model 2, p=0.0033; Model 3, p=0.021; n=28 sessions). ** p<0.01, * p<0.05. J) Same as (I), but for absolute value of decoder coefficients ($|\beta|$), a measure of overall contribution to the decoder. $|\beta|$ did not differ between GMM clusters for any of the three models (Model 1, p=0.15; Model 2, p=0.94; Model 3, p=0.86; n=28 sessions). N.S. Not Significant.

**Figure 7. Functional clusters exhibit ramping activity.**
A) Example trial showing simultaneously recorded Cluster 1 neuron activity. Top: Raster plot of Cluster 1 neurons (n=25). Middle: Raster plot of mouse licks. Bottom: Mouse speed (blue) and average firing rate of Cluster 1 neurons (red). Magenta dashed lines: Reward delivery (4 $\mu$L). B) Schematic of ramp and step models. C) Single-trial ramping and stepping model fits for Cluster 1 neurons from an example session (80_20200317). D) Model comparison per GMM Cluster. In panels D and E, data points indicate sessions, colors indicate mice, and error bars indicate mean ± SEM across sessions. Stars indicate significance level of a t-test versus zero per GMM Cluster. *** p<0.001, ** p<0.01, * p<0.05, n.s. Not Significant. LL: Log likelihood. E) Reward coefficients and ramping slopes across reward size per GMM cluster.

Update of

Competitive integration of time and reward explains value-sensitive foraging decisions and frontal cortex ramping dynamics.
Bukwich M, Campbell MG, Zoltowski D, Kingsbury L, Tomov MS, Stern J, Kim HR, Drugowitsch J, Linderman SW, Uchida N. bioRxiv [Preprint]. 2024 Sep 14:2023.09.05.556267. doi: 10.1101/2023.09.05.556267. Update in: Neuron. 2025 Oct 15;113(20):3458-3475.e12. doi: 10.1016/j.neuron.2025.07.008. PMID: 37732217. Free PMC article. Preprint.
