Review

Philos Trans R Soc Lond B Biol Sci. 2011 May 12;366(1569):1410-24.

doi: 10.1098/rstb.2010.0311.

Controlling for non-independence in comparative analysis of patterns across populations within species

Graham N Stone¹, Sean Nee, Joseph Felsenstein

PMID: 21444315
PMCID: PMC3081573
DOI: 10.1098/rstb.2010.0311

Graham N Stone et al. Philos Trans R Soc Lond B Biol Sci. 2011.

Affiliation

¹ Institute of Evolutionary Biology, The King's Buildings, West Mains Road, Edinburgh EH9 3JT, UK. graham.stone@ed.ac.uk

Abstract

How do we quantify patterns (such as responses to local selection) sampled across multiple populations within a single species? Key to this question is the extent to which populations within species represent statistically independent data points in our analysis. Comparative analyses across species and higher taxa have long recognized the need to control for the non-independence of species data that arises through patterns of shared common ancestry among them (phylogenetic non-independence), as have quantitative genetic studies of individuals linked by a pedigree. Analyses across populations lacking pedigree information fall in the middle, and not only have to deal with shared common ancestry, but also the impact of exchange of migrants between populations (gene flow). As a result, phenotypes measured in one population are influenced by processes acting on others, and may not be a good guide to either the strength or direction of local selection. Although many studies examine patterns across populations within species, few consider such non-independence. Here, we discuss the sources of non-independence in comparative analysis, and show why the phylogeny-based approaches widely used in cross-species analyses are unlikely to be useful in analyses across populations within species. We outline the approaches (intraspecific contrasts, generalized least squares, generalized linear mixed models and autoregression) that have been used in this context, and explain their specific assumptions. We highlight the power of 'mixed models' in many contexts where problems of non-independence arise, and show that these allow incorporation of both shared common ancestry and gene flow. We suggest what can be done when ideal solutions are inaccessible, highlight the need for incorporation of a wider range of population models in intraspecific comparative methods and call for simulation studies of the error rates associated with alternative approaches.
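One way to write down the kind of mixed model the abstract refers to is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $\mathbf{y}$ holds the population means of the response, $\mathbf{X}$ the predictors, $\mathbf{C}$ an among-population covariance matrix reflecting shared ancestry, gene flow, or both, and $\sigma_u^2$ and $\sigma_e^2$ the structured and residual variances.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{u} + \mathbf{e}, \qquad
\mathbf{u} \sim \mathcal{N}(\mathbf{0}, \sigma_u^2 \mathbf{C}), \quad
\mathbf{e} \sim \mathcal{N}(\mathbf{0}, \sigma_e^2 \mathbf{I}), \\
\mathbf{V} &= \sigma_u^2 \mathbf{C} + \sigma_e^2 \mathbf{I}, \\
\hat{\boldsymbol{\beta}} &= \left(\mathbf{X}^{\top}\mathbf{V}^{-1}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{V}^{-1}\mathbf{y}.
\end{aligned}
```

If $\mathbf{C} = \mathbf{I}$, then $\mathbf{V}$ is proportional to the identity and $\hat{\boldsymbol{\beta}}$ reduces to the ordinary least-squares estimate, which corresponds to treating populations as independent data points.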

Figures

**Figure 1.**
Sources of non-independence in population data. (a) Diagrammatic representation of the history of population splits and gene flow linking populations within species. This history results in three genetic contributions to measured population phenotypes, shown diagrammatically for a set of four populations: (i) contributions owing to shared common ancestry (represented by the colours of internal branches in the population tree), (ii) evolution specific to each population owing to selection and drift (represented by colour changes along terminal branches), and (iii) impacts of gene flow (exchange of migrants or gametes) between populations (indicated by arrows, for simplicity shown only for population 1). (b) Gene flow brings into a recipient population a subset of the genetic variation in source populations. Three source populations (1–3) contribute migrants to a recipient population (4). Imagine recipient population 4 has a higher value for a trait (distribution x in the frequency distribution diagram at right) under selection/drift equilibrium than the source populations (which, for simplicity, all share distribution y). Migration into population 4 followed by interbreeding displaces the trait value distribution for this population downwards to a new equilibrium (distribution z). The impact of gene flow is greatest when, relative to a recipient population, source populations have very different equilibrium trait distributions and contribute large numbers of migrants. Under such circumstances, the phenotypes measured in any population may be a poor guide to the selective forces acting on it. Migration effects must be accounted for before local selective effects can be estimated. (c) Population models assumed by different analytical approaches discussed in the text. Assumption of population independence implies no impact of either gene flow or history. This occurs when there is no gene flow and populations are either entirely unrelated (i) or influenced only by population-specific processes (ii), as might happen when selection acting on populations is so rapid and strong that ancestral states can be ignored. Analyses that incorporate only population history (iii) assume no gene flow, while analyses that incorporate only gene flow (iv) assume no population similarity through common ancestry.
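The displacement described in panel (b) can be illustrated with a deliberately simple toy recursion. This is a sketch under stated assumptions, not the authors' model: each generation, selection moves the recipient population's mean a fraction `s` of the way towards its local optimum, and a fraction `m` of the population is then replaced by migrants carrying the source mean. The function name, parameters and numerical values are all hypothetical.

```python
def equilibrium_mean(local_optimum, source_mean, m, s, generations=500):
    """Iterate a toy selection-then-migration recursion for a trait mean.

    local_optimum: mean favoured by local selection (distribution x)
    source_mean:   mean of the migrant source populations (distribution y)
    m:             fraction of the recipient replaced by migrants per generation
    s:             fraction of the gap to the optimum closed by selection
    Returns the mean after `generations` steps (approximating distribution z).
    """
    z = local_optimum  # start at the selection-only equilibrium
    for _ in range(generations):
        z_selected = z + s * (local_optimum - z)      # selection step
        z = (1 - m) * z_selected + m * source_mean    # migration step
    return z


# Hypothetical values: recipient optimum 10, source mean 4.
no_flow = equilibrium_mean(10.0, 4.0, m=0.0, s=0.2)
low_flow = equilibrium_mean(10.0, 4.0, m=0.05, s=0.2)
high_flow = equilibrium_mean(10.0, 4.0, m=0.2, s=0.2)
print(no_flow, low_flow, high_flow)
```

In this toy setting the observed mean sits further below the local optimum as `m` grows or as the source mean moves away from the optimum, so reading the optimum directly off the measured phenotype would underestimate it.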

**Figure 2.**
Consequences of phylogenetic non-independence for inferring relationships between variables across populations. Consider four populations, with mean values for two variables (independent variable x and dependent response variable y) as shown at top right. Setting gene flow aside for the moment, if these populations are equally unrelated phylogenetically (a), data for them can be considered independent, and the relationship across all four populations is a positive correlation (b). However, imagine that populations 1–2 and 3–4 comprise two pairs of closely related populations (c). The high trait values shared by both 1 and 2 (and the low values shared by both 3 and 4) are likely not to be independent, but to reflect low divergence within each pair from a common ancestor with high and low trait values, respectively. Now the relationship between x and y is negative within each population pair (black lines in (d)), but positive when analysed across the ancestors of each population pair (red line). Each of these three relationships is phylogenetically independent. A different pattern of relationships among the same set of populations can generate diametrically opposing relationships between x and y, as shown in (e). Now the relationship within each population pair is positive (black fitted lines in (f), right), while the relationship across the ancestors of the two population pairs is negative. These issues pertain whether the populations are sampled in the wild or grown in a common garden or provenance trial.
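The contrast between panels (b) and (d) can be reproduced with a small numerical sketch. The population means below are hypothetical values chosen for illustration, not read from the figure, and each ancestor is crudely approximated by the unweighted mean of its two descendant populations, which is an assumption of this sketch.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical (x, y) means for populations 1-4.
pops = {1: (8.0, 10.0), 2: (10.0, 8.0), 3: (2.0, 4.0), 4: (4.0, 2.0)}
pairs = [(1, 2), (3, 4)]  # closely related pairs, as in panel (c)

# Treating all four populations as independent points (panel b).
pooled_slope = ols_slope([pops[p][0] for p in sorted(pops)],
                         [pops[p][1] for p in sorted(pops)])

# Slope within each closely related pair (black lines in panel d).
within_slopes = [
    ols_slope([pops[a][0], pops[b][0]], [pops[a][1], pops[b][1]])
    for a, b in pairs
]

# Slope across the two approximated ancestors (red line in panel d).
ancestors = [((pops[a][0] + pops[b][0]) / 2, (pops[a][1] + pops[b][1]) / 2)
             for a, b in pairs]
ancestor_slope = ols_slope([anc[0] for anc in ancestors],
                           [anc[1] for anc in ancestors])

print(pooled_slope, within_slopes, ancestor_slope)
```

With these values the pooled slope is positive while both within-pair slopes are negative, so the sign of the inferred relationship depends on whether the pairing of populations is taken into account.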

Cited by

Microsatellite DNA suggests that group size affects sex-biased dispersal patterns in red colobus monkeys.
Miyamoto MM, Allen JM, Gogarten JF, Chapman CA. Am J Primatol. 2013 May;75(5):478-90. doi: 10.1002/ajp.22124. Epub 2013 Jan 10. PMID: 23307485. Free PMC article.

Dynamics of Deleterious Mutations and Purifying Selection in Small Population Isolates.
Chen Y, Feng X, Reid K, Zhang C, Löytynoja A, Merilä J. Mol Biol Evol. 2025 Jul 1;42(7):msaf110. doi: 10.1093/molbev/msaf110. PMID: 40689857. Free PMC article.

Genetic population structure accounts for contemporary ecogeographic patterns in tropic and subtropic-dwelling humans.
Hruschka DJ, Hadley C, Brewis AA, Stojanowski CM. PLoS One. 2015 Mar 27;10(3):e0122301. doi: 10.1371/journal.pone.0122301. eCollection 2015. PMID: 25816235. Free PMC article.

The role of spatial processes and environmental determinants in microgeographic shell variation of the freshwater snail Chilina dombeyana (Bruguière, 1789).
Bertin A, Ruíz VH, Figueroa R, Gouin N. Naturwissenschaften. 2012 Mar;99(3):225-32. doi: 10.1007/s00114-012-0890-8. Epub 2012 Feb 12. PMID: 22328071.

Population genomics perspectives on convergent adaptation.
Lee KM, Coop G. Philos Trans R Soc Lond B Biol Sci. 2019 Jul 22;374(1777):20180236. doi: 10.1098/rstb.2018.0236. Epub 2019 Jun 3. PMID: 31154979. Free PMC article. Review.
