Review
Genome Res. 2012 Aug;22(8):1383-94.
doi: 10.1101/gr.133702.111. Epub 2012 Jun 4.

Human genomic disease variants: a neutral evolutionary explanation

Joel T Dudley et al. Genome Res. 2012 Aug.

Abstract

Many perspectives on the role of evolution in human health include nonempirical assumptions concerning the adaptive evolutionary origins of human diseases. Evolutionary analyses of the increasing wealth of clinical and population genomic data have begun to challenge these presumptions. In order to systematically evaluate such claims, the time has come to build a common framework for an empirical and intellectual unification of evolution and modern medicine. We review the emerging evidence and provide a supporting conceptual framework that establishes the classical neutral theory of molecular evolution (NTME) as the basis for evaluating disease-associated genomic variations in health and medicine. For over a decade, the NTME has already explained the origins and distribution of variants implicated in diseases and has illuminated the power of evolutionary thinking in genomic medicine. We suggest that a majority of disease variants in modern populations will have neutral evolutionary origins (previously neutral), with a relatively smaller fraction exhibiting adaptive evolutionary origins (previously adaptive). This pattern is expected to hold true for common as well as rare disease variants. Ultimately, a neutral evolutionary perspective will provide medicine with an informative and actionable framework that enables objective clinical assessment beyond convenient tendencies to invoke past adaptive events in human history as a root cause of human disease.

Figures

Figure 1.
Observed patterns of population polymorphisms that are consistent with the neutral theory of molecular evolution. (A) Proportions of single nucleotide variants (SNVs) with a rejected substitution (RS) score greater than 3 in coding and noncoding regions (Goode et al. 2010). A lower degree of genetic variation is permitted by purifying selection in functionally important regions (coding) of the genome as compared with noncoding regions with no known function. (B) Proportions of nonsynonymous SNVs predicted to be damaging by SIFT and PolyPhen-2 programs (Marth et al. 2011), which show an enrichment of deleterious (nonsynonymous) variants among the rare polymorphisms (allele frequency <1%).
Figure 2.
Relative occurrence of neutral and adaptive mutations that will reach high frequency (HF: 50%–90%) in a population, assuming that a fraction f of all new mutations is adaptive. An overwhelming majority of HF mutations in a population are expected to be neutral even when the adaptive mutations have a very high selective advantage (s, increment of fitness) relative to the wild-type allele. The fraction of HF adaptive mutations is not expected to exceed 3.5% even when 1% of all new mutations in a population are adaptive and the adaptive allele has a very high selective advantage. The expected proportions of HF neutral and adaptive polymorphisms were estimated using equations 11 and 12 of Sawyer and Hartl (1992), which describe the equilibrium flux of mutant alleles reaching high frequency in an effective population size of 2N. Note that increasing 2Ns will not increase the proportion of adaptive HF polymorphisms, as the elevated probability of entering the high-frequency range is balanced by a shortening of the transient polymorphic period. Also, we have used a rather liberal upper bound on the relative proportion of adaptive mutations (f = 1%; dashed line), but the actual fraction is expected to be smaller (f = 0.2%; solid line). For simplicity, only neutral and adaptive mutations were considered in the above calculations: deleterious mutations are unlikely to rise to very high frequencies (>50%) on their own, and they are unlikely to do so by hitchhiking with adaptive variation because only a small portion of the genome hitchhikes with adaptive substitutions (Chun and Fay 2011). In any case, the maximum number of deleterious variants with high frequency will be similar to that of beneficial variants. Also, some low-frequency neutral variants will hitchhike to high frequency, which will cause a temporary increase in high-frequency neutral derived alleles during or immediately after the fixation of the adaptive mutation. This will not have any impact on the overall expectations, because such increases are followed by a long period of below-average abundance of high-frequency variants (Kim and Stephan 2000).
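The kind of calculation described in this caption can be illustrated with a rough sketch. It assumes the standard stationary frequency densities of the Poisson random field model (proportional to 1/x for neutral mutations, and to (1 - e^(-2γ(1-x))) / ((1 - e^(-2γ)) x(1-x)) for selected ones), which may differ in detail from the equations the authors actually used. The function names, the scaled selection parameter `gamma`, and its exact relation to 2Ns are assumptions made for illustration.

```python
import math

def neutral_density(x):
    # Assumed stationary density of neutral mutant frequencies (up to a constant).
    return 1.0 / x

def adaptive_density(x, gamma):
    # Assumed stationary density for mutations with scaled selection gamma > 0,
    # using the same constant as neutral_density so the two are comparable.
    num = -math.expm1(-2.0 * gamma * (1.0 - x))
    den = -math.expm1(-2.0 * gamma)
    return num / (den * x * (1.0 - x))

def integrate(fn, lo, hi, steps=10_000):
    # Composite trapezoid rule; adequate for these smooth integrands.
    h = (hi - lo) / steps
    total = 0.5 * (fn(lo) + fn(hi))
    for i in range(1, steps):
        total += fn(lo + i * h)
    return total * h

def adaptive_share_hf(f, gamma, lo=0.5, hi=0.9):
    # Share of high-frequency (lo..hi) polymorphisms that are adaptive when a
    # fraction f of new mutations is adaptive and the rest are neutral.
    neutral = (1.0 - f) * integrate(neutral_density, lo, hi)
    adaptive = f * integrate(lambda x: adaptive_density(x, gamma), lo, hi)
    return adaptive / (neutral + adaptive)

if __name__ == "__main__":
    for f in (0.002, 0.01):
        for gamma in (5, 50, 500):
            print(f, gamma, round(adaptive_share_hf(f, gamma), 4))
```

Under these assumptions the adaptive share remains a few percent for f = 1% and levels off as `gamma` grows, which is the qualitative behavior the caption describes.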
Figure 3.
Relationship between the cross-species evolutionary rate and minor allele frequency (MAF) for nSNVs. On average, slower-evolving positions host nSNVs with lower MAFs. MAFs for 15,043 neutral nSNVs from the HumVar data set were obtained for one HapMap population (CEU) (The International HapMap 3 Consortium 2010). Long-term evolutionary rates were estimated using the protein sequence alignment from the UCSC Genome Browser resource (Kumar et al. 2009; Fujita et al. 2011). For each evolutionary rate category, the median MAF is plotted along with error bars (2 SEM).
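A summary of this kind (median MAF per evolutionary rate category, with an error bar of two standard errors of the mean) could be computed roughly as follows. This is a minimal sketch: the input layout, the bin edges, and the function name are hypothetical, and the caption does not say how the rate categories were defined.

```python
import math
import statistics
from collections import defaultdict

def maf_summary_by_rate(records, bin_edges):
    """Group (evolutionary_rate, maf) pairs into rate categories.

    bin_edges is an ascending list of hypothetical category boundaries.
    Returns {category_index: (median_maf, two_sem)}.
    """
    bins = defaultdict(list)
    for rate, maf in records:
        # Category index = number of boundaries at or below this rate.
        idx = sum(rate >= edge for edge in bin_edges)
        bins[idx].append(maf)

    summary = {}
    for idx, mafs in sorted(bins.items()):
        # SEM needs at least two observations; fall back to 0 otherwise.
        sem = statistics.stdev(mafs) / math.sqrt(len(mafs)) if len(mafs) > 1 else 0.0
        summary[idx] = (statistics.median(mafs), 2.0 * sem)
    return summary

if __name__ == "__main__":
    # Toy values for illustration only, not data from the paper.
    toy = [(0.1, 0.02), (0.2, 0.04), (1.5, 0.10), (1.7, 0.30), (1.9, 0.20)]
    print(maf_summary_by_rate(toy, bin_edges=[1.0]))
```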
Figure 4.
The cumulative frequency distribution of disease-associated variants (closed circles) and population polymorphisms (dashed lines) over long-term evolutionary substitution rates, with darker background indicating increasing evolutionary variability. (A) Nonsynonymous variants associated with Mendelian diseases; 39,578 variants from HGMD are shown (Stenson et al. 2009). (B) Nonsense variants associated with Mendelian diseases; 2806 variants from HGMD are shown. (C) Nonsynonymous variants in human mitochondrion associated with diseases; 190 disease-associated and 964 polymorphic variants from MITOMAP are shown (Ruiz-Pesini et al. 2007). (D) Nonsynonymous somatic mutations associated with cancers; 1207 driver variants from CanPredict are shown (Kaminker et al. 2007). (E) Nonsynonymous variants associated with complex diseases; 8644 variants from VARIMED are shown (Chen et al. 2010). These distributions show that protein positions harboring disease-associated variants tend to evolve at a much slower rate (lighter background) than positions harboring population polymorphisms. This pattern is the most distinct for nonsynonymous variants associated with Mendelian diseases, cancers, or mitochondrial diseases. The distinction becomes less obvious for nonsense variants, and the pattern completely disappears for nonsynonymous variants associated with complex diseases, the latter of which are expected to exhibit a neutral evolutionary pattern due to their modest effects on fecundity. Evolutionary rates in panels A, B, D, and E are estimated using multiple alignments of orthologs from 46 species (Fujita et al. 2011) following the procedure in Kumar et al. (2009). For panel C, amino acid sequence alignments for mitochondrial proteins were obtained for 28 mammalian species from MamMiBase (Vasconcelos et al. 2005) and the evolutionary rate was estimated following Kumar et al. (2009).
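The comparison in this figure rests on cumulative frequency distributions over evolutionary rate. A minimal sketch of that computation, with a hypothetical function name and made-up rates, might look like this:

```python
from bisect import bisect_right

def cumulative_fraction(rates, thresholds):
    # Fraction of variants whose position evolves at a rate <= each threshold.
    ordered = sorted(rates)
    n = len(ordered)
    return [bisect_right(ordered, t) / n for t in thresholds]

if __name__ == "__main__":
    # Toy rates for illustration only, not data from the paper.
    disease = [0.1, 0.2, 0.3, 1.0]
    polymorphism = [0.5, 1.0, 1.5, 2.0]
    thresholds = [0.5, 1.0, 2.0]
    print(cumulative_fraction(disease, thresholds))
    print(cumulative_fraction(polymorphism, thresholds))
```

A disease curve that rises faster than the polymorphism curve at low rates corresponds to the enrichment at slowly evolving positions that the caption reports.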
Figure 5.
The ratio of observed to expected (O/E) numbers of personal exome variants (nonsynonymous, nSNVs) found in positions with higher to lower degrees of evolutionary conservation (slow to fast evolutionary rates). There is an underabundance of nSNVs at slow-evolving positions (O/E < 1; lighter background) and an overabundance at fast-evolving positions (O/E > 1; darker background). A linear regression model fits the observed data well (R² > 0.95), which indicates a decrease in the strength of natural selection with increasing evolutionary rate, as expected under the neutral theory. Sequences of eight HapMap exomes (Ng et al. 2009) were used to generate the figure, based on evolutionary rates estimated using the 46-species alignment of vertebrates and lamprey from the UCSC Genome Browser resource (Kumar et al. 2009; Fujita et al. 2011).
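The O/E ratio and the linear fit can be illustrated with a short sketch. The caption does not state how expected counts were derived; here they are assumed to be proportional to the number of positions in each rate category, and all names and numbers are hypothetical.

```python
def observed_to_expected(observed, positions):
    # Assumption: the expected count in a category is the total observed count
    # times that category's share of all positions.
    total_obs = sum(observed)
    total_pos = sum(positions)
    return [o / (total_obs * p / total_pos) for o, p in zip(observed, positions)]

def linear_fit(xs, ys):
    # Ordinary least squares for y = intercept + slope * x, with R^2.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

if __name__ == "__main__":
    # Toy counts per rate category (slow to fast), for illustration only.
    observed = [10, 20, 30, 40]
    positions = [25, 25, 25, 25]
    ratios = observed_to_expected(observed, positions)
    print(ratios, linear_fit([0, 1, 2, 3], ratios))
```

A positive slope with O/E below 1 in the slow categories and above 1 in the fast ones is the pattern the caption describes.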

References

    1. The 1000 Genomes Project Consortium 2010. A map of human genome variation from population-scale sequencing. Nature 467: 1061–1073
    2. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR 2010. A method and server for predicting damaging missense mutations. Nat Methods 7: 248–249
    3. Ajioka RS, Jorde LB, Gruen JR, Yu P, Dimitrova D, Barrow J, Radisky E, Edwards CQ, Griffen LM, Kushner JP 1997. Haplotype analysis of hemochromatosis: Evaluation of different linkage-disequilibrium approaches and evolution of disease chromosomes. Am J Hum Genet 60: 1439–1447
    4. Allison AC 1954. Protection afforded by sickle-cell trait against subtertian malarial infection. Br Med J 1: 290–294
    5. Ashley EA, Butte AJ, Wheeler MT, Chen R, Klein TE, Dewey FE, Dudley JT, Ormond KE, Pavlovic A, Morgan AA, et al. 2010. Clinical assessment incorporating a personal genome. Lancet 375: 1525–1535

Publication types