Structured, uncertainty-driven exploration in real-world consumer choice

Eric Schulz¹, Rahul Bhui², Bradley C Love³,⁴, Bastien Brier⁵, Michael T Todd⁵, Samuel J Gershman²

Affiliations

¹ Department of Psychology, Harvard University, Cambridge, MA 02138; ericschulz@fas.harvard.edu.
² Department of Psychology, Harvard University, Cambridge, MA 02138.
³ Department of Experimental Psychology, University College London, London WC1H 0AP, United Kingdom.
⁴ The Alan Turing Institute, London NW1 2DB, United Kingdom.
⁵ Data Science Team, Deliveroo, London EC4R 3TE, United Kingdom.

PMID: 31235598
PMCID: PMC6628813
DOI: 10.1073/pnas.1821028116

Eric Schulz et al. Proc Natl Acad Sci U S A. 2019 Jul 9;116(28):13903-13908. doi: 10.1073/pnas.1821028116. Epub 2019 Jun 24.


Abstract

Making good decisions requires people to appropriately explore their available options and generalize what they have learned. While computational models can explain exploratory behavior in constrained laboratory tasks, it is unclear to what extent these models generalize to real-world choice problems. We investigate the factors guiding exploratory behavior in a dataset consisting of 195,333 customers placing 1,613,967 orders from a large online food delivery service. We find important hallmarks of adaptive exploration and generalization, which we analyze using computational models. In particular, customers seem to engage in uncertainty-directed exploration and use feature-based generalization to guide their exploration. Our results provide evidence that people use sophisticated strategies to explore complex, real-world environments.

Keywords: decision making; exploration; generalization; reinforcement learning.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Fig. 1.**
Learning and exploration over time. (A) Average order rating by number of past orders. (B) Probability of sampling a new restaurant as a function of the number of past orders. The dashed black line indicates the simulated behavior of agents that explore the available restaurants at random. (C) Distribution of order ratings for newly sampled and known restaurants. (D) Average probability of reordering from a restaurant as a function of reward prediction error (RPE). Means are displayed as black squares and error bars show the 95% confidence interval of the mean.

**Fig. 2.**
Factors influencing exploration. (A) Effect of relative price. The relative price indicates how much cheaper or more expensive a restaurant was compared with an average restaurant in the same city. (B) Effect of standardized (z-transformed) estimated delivery time. (C) Effect of average rating. (D) Effect of a restaurant’s number of past ratings (certainty). Means are displayed as black squares and error bars show the 95% confidence interval of the mean.

**Fig. 3.**
Signatures of uncertainty-directed exploration. (A) Entropy of the next four choices as a function of reward prediction error (RPE). (B) Probability of reordering from a restaurant as a function of RPE, shown for restaurants with high and low relative variance. (C) Probability of choosing a novel restaurant as a function of its difference from an average restaurant within the same cuisine type, shown for restaurants with high and low relative variance. (D) Probability of choosing a novel restaurant as a function of its relative price, shown for restaurants with high and low relative variance.

**Fig. 4.**
Clusters and changes of exploration. (A) Clusters of exploration between different cuisine types within customers’ consecutive explorations. Green rectangles mark clusters of exploration. (B) Moves between clusters after better-than-expected (positive RPE) and worse-than-expected (negative RPE) outcomes compared with a restaurant-specific mean baseline. Centers of radar plots indicate a change of −5%, and outermost lines indicate a change of +5%. A change of 1% roughly translates to 500 orders.

**Fig. 5.**
Signatures of generalization. (A) Probability of switches between cuisine types and rated similarities between the same types. (B) Average rating per city and proportion of exploratory choices. Turquoise line marks least-squares regression line. (C) Predictability of a restaurant’s quality and average rating of explored restaurants. Turquoise line marks least-squares regression line. (D) Results of model comparison for new customers’ behavior. Considered models were the Bayesian mean tracker (BMT), a Gaussian process with a mean-greedy sampling strategy (GP-M), and a Gaussian process with an upper confidence bound sampling strategy (GP-UCB).
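The difference between the mean-greedy and upper confidence bound (UCB) sampling strategies named in Fig. 5D can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from this record: the class `MeanTracker`, the functions `mean_greedy` and `ucb`, the prior values, the noise variance, and the exploration weight `beta` are all hypothetical. The belief update is a common textbook formulation of a Bayesian mean tracker (a Kalman-style update per option), not necessarily the authors' exact specification, and it deliberately omits the feature-based generalization that a Gaussian process would add.

```python
import math
from dataclasses import dataclass


@dataclass
class MeanTracker:
    """Gaussian belief about one option's mean reward (hypothetical helper)."""
    mean: float = 3.0      # assumed prior mean rating
    variance: float = 1.0  # assumed prior variance (uncertainty)
    noise: float = 1.0     # assumed observation-noise variance

    def update(self, reward: float) -> float:
        """Kalman-style belief update; returns the reward prediction error."""
        rpe = reward - self.mean                      # observed minus expected
        gain = self.variance / (self.variance + self.noise)
        self.mean += gain * rpe                       # move the estimate toward the outcome
        self.variance *= 1.0 - gain                   # uncertainty shrinks after each observation
        return rpe


def mean_greedy(options: dict[str, MeanTracker]) -> str:
    """Pick the option with the highest expected reward, ignoring uncertainty."""
    return max(options, key=lambda name: options[name].mean)


def ucb(options: dict[str, MeanTracker], beta: float = 1.0) -> str:
    """Pick the option with the highest mean plus an uncertainty bonus."""
    return max(
        options,
        key=lambda name: options[name].mean + beta * math.sqrt(options[name].variance),
    )


# Illustrative beliefs: a well-known restaurant and a rarely rated one.
options = {
    "familiar": MeanTracker(mean=4.0, variance=0.05),
    "novel": MeanTracker(mean=3.5, variance=1.0),
}

print("mean-greedy choice:", mean_greedy(options))
print("UCB choice:", ucb(options, beta=1.0))

# Ordering from the novel restaurant and rating it 5 yields a positive RPE
# and a less uncertain belief about that restaurant.
rpe = options["novel"].update(5.0)
print("RPE:", rpe, "new mean:", options["novel"].mean, "new variance:", options["novel"].variance)
```

With these illustrative numbers, the mean-greedy rule stays with the familiar option while the uncertainty bonus tips UCB toward the novel one, which is the qualitative pattern that uncertainty-directed exploration refers to.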
