Review
Hum Genet. 2020 Jan;139(1):5-21.
doi: 10.1007/s00439-019-02040-6. Epub 2019 Jun 14.

Evolutionary perspectives on polygenic selection, missing heritability, and GWAS

Lawrence H Uricchio. Hum Genet. 2020 Jan.

Abstract

Genome-wide association studies (GWAS) have successfully identified many trait-associated variants, but there is still much we do not know about the genetic basis of complex traits. Here, we review recent theoretical and empirical literature regarding selection on complex traits to argue that "missing heritability" is as much an evolutionary problem as it is a statistical problem. We discuss empirical findings that suggest a role for selection in shaping the effect sizes and allele frequencies of causal variation underlying complex traits, and the limitations of these studies. We then use simulations of selection, realistic genome structure, and complex human demography to illustrate the results of recent theoretical work on polygenic selection, and show that statistical inference of causal loci is sharply affected by evolutionary processes. In particular, when selection acts on causal alleles, it hampers the ability to detect causal loci and constrains the transferability of GWAS results across populations. Last, we discuss the implications of these findings for future association studies, and suggest that future statistical methods to infer causal loci for genetic traits will benefit from explicit modeling of the joint distribution of effect sizes and allele frequencies under plausible evolutionary models.

Conflict of interest statement

No conflict of interest exists.

Figures

Figure 1:
A) A pictorial representation of a stabilizing selection model. The distribution of phenotypes in the population (gray area) is centered to the right of the current optimal phenotype value (indicated by the peak in the green dashed line). Consequently, individuals with slightly lower than average phenotypes will have the highest fitness, and the distribution will shift to the left over evolutionary time. B) Distribution of selection coefficients as a function of allele frequency in a sample of 200 chromosomes in the stabilizing selection model. We display only the first 10 bins of the frequency spectrum here. Strongly selected alleles are found preferentially at low frequency. Note that $s$ is proportional to $\beta^2$ in a Gaussian stabilizing selection model. C) We plot the cumulative variance explained by alleles at or below frequency $x$ (denoted $V_x$) due to each category of selection strength in B), relative to the total genetic variance ($V_1$). We observe that a substantial amount of the genetic variance is due to large effect alleles at very low frequency, despite these alleles being exceedingly rare. Very large effect alleles are not observed at higher frequencies and hence contribute no variance in the more common part of the frequency spectrum. Nearly-neutral alleles ($|s| < 5 \times 10^{-6}$) make almost no contribution to genetic variance, while intermediate effect alleles ($5 \times 10^{-5} < |s| < 5 \times 10^{-3}$) have a substantial impact at both low and high frequency. Note that this figure is intended only to illustrate model behaviors; we do not imply that these parameters or calculations conform to any real trait.
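The quantity plotted in panel C can be sketched in a few lines. The example below is a minimal illustration, not the paper's simulation code: it assumes an additive trait with Hardy-Weinberg genotype proportions, so that a biallelic locus at frequency $p$ with effect size $\beta$ contributes $2p(1-p)\beta^2$ to the genetic variance. The function names and the toy alleles are hypothetical.

```python
def additive_variance(freq, beta):
    """Genetic variance from one biallelic locus under additivity: 2p(1-p)*beta^2."""
    return 2.0 * freq * (1.0 - freq) * beta ** 2


def cumulative_variance_fraction(alleles, x):
    """Return V_x / V_1, the share of genetic variance from alleles at frequency <= x.

    `alleles` is a list of (frequency, effect_size) pairs.
    """
    total = sum(additive_variance(p, b) for p, b in alleles)
    if total == 0.0:
        raise ValueError("total genetic variance is zero")
    below = sum(additive_variance(p, b) for p, b in alleles if p <= x)
    return below / total


# Hypothetical toy alleles (frequency, effect size); values are NOT from the paper.
toy_alleles = [
    (0.005, 2.0),  # rare, large effect
    (0.01, 1.5),   # rare, large effect
    (0.10, 0.3),   # intermediate frequency and effect
    (0.40, 0.1),   # common, small effect
]

for x in (0.01, 0.1, 1.0):
    frac = cumulative_variance_fraction(toy_alleles, x)
    print(f"V_x/V_1 at x={x}: {frac:.3f}")
```

In this toy set the rare, large-effect alleles account for most of the variance, which mirrors the qualitative pattern the caption describes.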
Figure 2:
A) Pictorial representation of the 4.9 Mb region on chromosome 6 that we simulate herein. Functional elements are encoded by position along the x-axis, while recombination rate is shown on the y-axis. The average population-scaled recombination rate in the region is 4Nr = 0.0071. B) Simplified pictorial representation of the simulated demographic model. The arrows represent migration. The dark blue represents the exponential growth events in each population. Note that two stages of growth occur in both the West African and European populations. The model is derived from those of Gravel et al. (2011) and Tennessen et al. (2012). C) PCs calculated from the simulated genetic data corresponding to one example simulation. Due to recent migrations, a handful of individuals are intermediate to the clusters corresponding to the continental groups. Abbreviations: UTRs, untranslated regions; CNCs, conserved non-coding regions; AFR, West African continental group; EUR, European continental group; EAS, East Asian continental group.
Figure 3:
A) Fraction of heritability explained by significantly associated loci ($h^2_{\mathrm{GWAS}}/h^2$) at a variety of significance ($\alpha$) thresholds as a function of the $\rho$ parameter of the Uricchio et al. (2016) phenotype model, with low $\rho$ representing high pleiotropy. Results represent aggregates of all alleles across 10 independent simulations and correspond to samples from the simulated African continental group. B) Fraction of heritability explained by common alleles (minor allele frequency > 1%). C) Fraction of common alleles (minor allele frequency > 0.01) with effect sizes $\beta$ in the 5% tail of effect sizes ($F_{0.05}(\beta)$) across all causal alleles.
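The summaries in panels B and C can be approximated with a short sketch. It assumes an additive trait, so that the fraction of heritability from a set of alleles equals that set's share of the genetic variance (the environmental variance cancels in the ratio). It also assumes that the "5% tail" means the largest 5% of absolute effect sizes; the caption does not spell out that definition. All names and toy values are hypothetical.

```python
def maf(p):
    """Minor allele frequency for an allele at frequency p."""
    return min(p, 1.0 - p)


def locus_variance(p, beta):
    """Additive genetic variance from one locus: 2p(1-p)*beta^2."""
    return 2.0 * p * (1.0 - p) * beta ** 2


def common_fraction_of_heritability(alleles, maf_cutoff=0.01):
    """Share of genetic variance carried by alleles with MAF above the cutoff."""
    total = sum(locus_variance(p, b) for p, b in alleles)
    if total == 0.0:
        raise ValueError("total genetic variance is zero")
    common = sum(locus_variance(p, b) for p, b in alleles if maf(p) > maf_cutoff)
    return common / total


def fraction_common_in_effect_tail(alleles, maf_cutoff=0.01, tail=0.05):
    """Fraction of common alleles whose |beta| falls in the top `tail` of all alleles.

    Assumption: the tail is defined on absolute effect sizes over all causal alleles.
    """
    abs_effects = sorted((abs(b) for _, b in alleles), reverse=True)
    n_tail = max(1, int(round(tail * len(abs_effects))))
    threshold = abs_effects[n_tail - 1]
    common = [(p, b) for p, b in alleles if maf(p) > maf_cutoff]
    if not common:
        return 0.0
    return sum(1 for _, b in common if abs(b) >= threshold) / len(common)


# Hypothetical toy alleles (frequency, effect size); values are NOT from the paper.
toy = [(0.002, 3.0)] + [(0.005, 1.0)] * 4 + [(0.2, 0.1)] * 15

print(common_fraction_of_heritability(toy))
print(fraction_common_in_effect_tail(toy))
```

When the largest effects sit only at rare alleles, as in this toy set, no common allele reaches the effect-size tail.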
Figure 4:
Fraction of heritability explained ($h^2_{\mathrm{explained}}/h^2$) in West African (AFR) and East Asian (EAS) continental groups by all causal loci in Europeans. The $\rho$ parameter is the pleiotropy parameter of the Uricchio et al. (2016) model, with low $\rho$ representing high pleiotropy. Box plots represent the distribution of values across 10 independent simulations.
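A rough sketch of this cross-population summary follows. It rests on two assumptions that the caption does not confirm: "causal loci in Europeans" is read as causal loci that are polymorphic in the source (European) sample, and effect sizes are taken to be identical across populations under an additive model. Function names and toy loci are hypothetical.

```python
def locus_variance(p, beta):
    """Additive genetic variance from one locus: 2p(1-p)*beta^2."""
    return 2.0 * p * (1.0 - p) * beta ** 2


def fraction_explained_in_target(loci):
    """Share of target-population genetic variance at loci segregating in the source.

    `loci` is a list of (p_source, p_target, beta) triples.
    """
    total = sum(locus_variance(pt, b) for _, pt, b in loci)
    if total == 0.0:
        raise ValueError("target population has zero genetic variance")
    shared = sum(
        locus_variance(pt, b) for ps, pt, b in loci if 0.0 < ps < 1.0
    )
    return shared / total


# Hypothetical toy loci (source freq, target freq, effect); NOT from the paper.
toy_loci = [
    (0.30, 0.25, 0.1),  # common in both populations
    (0.00, 0.01, 1.0),  # private to the target, large effect
    (0.02, 0.00, 0.8),  # private to the source
    (0.10, 0.15, 0.2),  # shared, moderate effect
]

print(f"{fraction_explained_in_target(toy_loci):.3f}")
```

Rare, large-effect alleles private to the target population lower this fraction, which is one way selection on causal alleles could limit transferability.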

