The evolutionary consequences of learning under competition

John M McNamara¹, Sasha R X Dall², Alasdair I Houston³, Olof Leimar⁴

Affiliations

¹ School of Mathematics, University of Bristol, Bristol BS8 1UG, UK.
² Centre for Ecology and Conservation, University of Exeter, Exeter TR10 9FE, UK.
³ School of Biological Sciences, University of Bristol, Bristol BS8 1TQ, UK.
⁴ Department of Zoology, Stockholm University, 10691 Stockholm, Sweden.

PMID: 39110908
PMCID: PMC11305653
DOI: 10.1098/rspb.2024.1141

John M McNamara et al. Proc Biol Sci. 2024 Aug;291(2028):20241141. doi: 10.1098/rspb.2024.1141. Epub 2024 Aug 7.

Abstract

Learning is a taxonomically widespread process by which animals change their behavioural responses to stimuli as a result of experience. In this way, it plays a crucial role in the development of individual behaviour and underpins substantial phenotypic variation within populations. Nevertheless, the impact of learning in social contexts on evolutionary change is not well understood. Here, we develop game theoretical models of competition for resources in small groups (e.g. producer-scrounger and hawk-dove games) in which actions are controlled by reinforcement learning and show that biases in the subjective valuation of different actions readily evolve. Moreover, in many cases, the convergence stable levels of bias exist at fitness minima and therefore lead to disruptive selection on learning rules and, potentially, to the evolution of genetic polymorphisms. Thus, we show how reinforcement learning in social contexts can be a driver of evolutionary diversification. In addition, we consider the evolution of ability in our games, showing that learning can also drive disruptive selection on the ability to perform a task.
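
The two conditions named in the abstract, a convergence stable trait value that is also a fitness minimum, can be illustrated with a toy calculation. The sketch below is not the paper's model: the fitness function and the constants `X0`, `B` and `CURVATURE` are hypothetical, chosen only so that both conditions hold at once. It checks the sign of the slope of the selection gradient (convergence stability) and the sign of the curvature of mutant pay-off (maximum versus minimum).

```python
# Toy adaptive-dynamics check. The fitness function is hypothetical and is
# NOT the learning model from the paper; it only illustrates the criteria.

X0 = -0.05        # assumed singular trait value (illustrative only)
B = 1.0           # assumed strength of directional selection towards X0
CURVATURE = 0.5   # assumed positive curvature of mutant fitness


def invasion_fitness(mutant, resident):
    """Hypothetical pay-off advantage of a rare mutant over the resident."""
    d = mutant - resident
    return -B * d * (resident - X0) + CURVATURE * d * d


def selection_gradient(resident, h=1e-5):
    """Slope of mutant fitness in the mutant trait, evaluated at the resident."""
    up = invasion_fitness(resident + h, resident)
    down = invasion_fitness(resident - h, resident)
    return (up - down) / (2 * h)


def gradient_slope(resident, h=1e-3):
    """How the selection gradient changes as the resident trait changes."""
    return (selection_gradient(resident + h) - selection_gradient(resident - h)) / (2 * h)


def mutant_curvature(resident, h=1e-3):
    """Second derivative of mutant fitness in the mutant trait, at the resident."""
    up = invasion_fitness(resident + h, resident)
    mid = invasion_fitness(resident, resident)
    down = invasion_fitness(resident - h, resident)
    return (up - 2 * mid + down) / (h * h)


def classify(resident, tol=1e-6):
    """Label a candidate singular point using the two criteria above."""
    if abs(selection_gradient(resident)) > tol:
        return "not singular"
    stable = gradient_slope(resident) < 0      # gradient pushes traits towards the point
    minimum = mutant_curvature(resident) > 0   # nearby mutants on both sides do better
    if stable and minimum:
        return "convergence stable fitness minimum (disruptive selection)"
    if stable:
        return "convergence stable fitness maximum"
    return "not convergence stable"


if __name__ == "__main__":
    print(classify(X0))
```

In this toy case the population is drawn towards `X0`, but once there, mutants on either side have higher pay-off than the resident, which is the situation the abstract describes as leading to disruptive selection.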

Keywords: disruptive selection; fitness minima; hawk–dove game; negative frequency dependence; producer–scrounger game; reinforcement learning.

Conflict of interest statement

We declare we have no competing interests.

Figures

**Figure 1.**
The pay-off rate to a mutant that has a fixed probability, $p$, of taking action $u_{2}$ in the symmetric game. (i) Residents choose action $u_{2}$ with probability $0.5$ in each round (horizontal dashed line). (ii) Residents learn using an unbiased learning rule. Two learning rules are illustrated: AV learning with $T = 400$ (red crosses) and AC learning with $T = 100$ (blue open circles). $G = 10$.

**Figure 2.**
Adaptive dynamics for the symmetric game. (a) and (b) give results for AV learning with $T = 400$. (a) The strength of selection (value of the selection gradient) on the inflation bias $\alpha$, showing that $\alpha = 0$ is a convergence stable point. (b) The pay-off rate to a mutant when the resident strategy is $\alpha = 0$. (c) and (d) give results for AC learning with $T = 100$. (c) The strength of selection on the initial bias $\theta_{0}$, showing that $\theta_{0} = 0$ is a convergence stable point. (d) The pay-off rate to a mutant when the resident strategy is $\theta_{0} = 0$. $G = 10$.

**Figure 3.**
Adaptive dynamics for the producer–scrounger game. (a) and (b) give results for AV learning with $T = 400$. (a) The strength of selection on the inflation bias $\alpha$, showing that there is a convergence stable point at approximately $\alpha = -0.05422$. (b) The pay-off rate to a mutant when the resident strategy is $\alpha = -0.05422$. (c) and (d) give results for AC learning with $T = 100$. (c) The strength of selection on the initial bias $\theta_{0}$, showing a convergence stable point at approximately $\theta_{0} = -2.688$. (d) The pay-off rate to a mutant when the resident strategy is $\theta_{0} = -2.688$. $G = 10$. Foraging parameters $e_{p} = 2$, $e_{s} = 3$.

**Figure 4.**
Distribution of the evolved bias in the producer–scrounger game after 50 000 generations. (a) Inflation bias $\alpha$ under AV learning when reproduction is sexual. (b) Initial bias $\theta_{0}$ under AC learning when reproduction is sexual. (c) Inflation bias $\alpha$ under AV learning when reproduction is asexual. (d) Initial bias $\theta_{0}$ under AC learning when reproduction is asexual. $T = 400$ for AV learning and $T = 100$ for AC learning. $G = 10$. Foraging parameters $e_{p} = 2$, $e_{s} = 3$. Details of the evolutionary simulation are given in electronic supplementary material, §4.

**Figure 5.**
Ability bias in the producer–scrounger and hawk–dove games. (a) Strength of selection on ability bias $a$ in the producer–scrounger game, showing a convergence stable value at approximately $a^{*} = -0.2897$. (b) The pay-off to a mutant with given ability bias when the resident population has ability bias $a = a^{*}$ in the producer–scrounger game. Two cases are illustrated: (i) there is no initial bias during learning ($\theta_{0} = 0$, blue squares) and (ii) a mutant with ability bias $a$ has initial bias $\theta_{0} = -0.4 (a - a^{*})$ (black triangles). (c) and (d) give analogous results for the hawk–dove game, for which $a^{*} = 0.218$. AC learning with $T = 100$. $e_{p} = 2$, $e_{s} = 3$, $V = 2, C = 4$, $G = 10$.
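
As a baseline for the hawk–dove parameters quoted in the Figure 5 caption ($V = 2$, $C = 4$), the sketch below computes expected pay-offs without any learning or ability bias. It assumes the textbook hawk–dove pay-off matrix, which may differ from the one used in the paper; the function names are illustrative. Under that assumption the mixed equilibrium is to play hawk with probability $V/C$, and against such a resident every fixed mutant probability earns the same pay-off.

```python
def hawk_dove_payoff(my_action, other_action, V=2.0, C=4.0):
    """Textbook hawk-dove pay-off to the focal player (assumed, not from the paper)."""
    if my_action == "hawk":
        # Two hawks fight: equal chance of winning V or paying the cost C.
        return (V - C) / 2 if other_action == "hawk" else V
    # A dove retreats from a hawk and shares the resource with another dove.
    return 0.0 if other_action == "hawk" else V / 2


def mutant_payoff(p, q, V=2.0, C=4.0):
    """Expected pay-off to a mutant playing hawk with probability p
    against a resident that plays hawk with probability q."""
    total = 0.0
    for mine, prob_mine in (("hawk", p), ("dove", 1 - p)):
        for theirs, prob_theirs in (("hawk", q), ("dove", 1 - q)):
            total += prob_mine * prob_theirs * hawk_dove_payoff(mine, theirs, V, C)
    return total


def mixed_ess(V=2.0, C=4.0):
    """Probability of hawk at the mixed equilibrium of the textbook game."""
    return min(1.0, V / C)


if __name__ == "__main__":
    q_star = mixed_ess()
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, mutant_payoff(p, q_star))
```

With these values the resident probability is 0.5 and the mutant pay-off is flat in `p`; this is the no-learning reference case, not a reproduction of the results with AC learning shown in the figure.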

Grants and funding

Vetenskapsrådet
