Nature. 2013 Jan 10;493(7431):216-20.
doi: 10.1038/nature11690. Epub 2012 Nov 28.

Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants

Wenqing Fu et al. Nature.

Erratum in

  • Nature. 2013 Mar 14;495(7440):270. Rieder, Mark J [added]

Abstract

Establishing the age of each mutation segregating in contemporary human populations is important to fully understand our evolutionary history and will help to facilitate the development of new approaches for disease-gene discovery. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth, notable for an excess of rare genetic variants, suggesting that many mutations arose recently. To more quantitatively assess the distribution of mutation ages, we resequenced 15,336 genes in 6,515 individuals of European American and African American ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that approximately 73% of all protein-coding SNVs and approximately 86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease genes contained a significantly higher proportion of recently arisen deleterious SNVs than other genes. Furthermore, European Americans had an excess of deleterious variants in essential and Mendelian disease genes compared to African Americans, consistent with weaker purifying selection due to the Out-of-Africa dispersal. Our results better delimit the historical details of human protein-coding variation, show the profound effect of recent human history on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease-gene discovery.

Figures

Figure 1. The vast majority of protein-coding SNVs arose recently
a, The site frequency spectrum for EAs (red) and AAs (blue). b, Cumulative proportion of SNVs for a given allele age. The inset highlights the cumulative proportion of SNVs that are estimated to have arisen in the last 50 kyr. c, Average age for all SNVs, SNVs found in both the EAs and AAs (shared), and SNVs found in only one population (specific). d, Average age for different types of variants. Error bars denote standard deviations.
Figure 2. Characteristics of allele age for deleterious SNVs
a, b, Average age of nonsynonymous and other SNVs as a function of the number of methods that predict the variant to be deleterious. Pie charts represent the proportion of SNVs that arose less than (black) or more than (white) 5 kyr ago. Error bars denote standard deviations. c, Relationship between the proportion of SNVs predicted to be deleterious and SNV age. Note that >99% of deleterious SNVs are estimated to have arisen in the past 150 kyr. Solid lines represent a loess fit to the data.
Figure 3. Distribution of deleterious SNVs across the exome before and after recent accelerated population growth
a, Rectangles represent the set of all protein-coding sequences for each chromosome. Vertical red and blue lines in EAs and AAs, respectively, denote deleterious SNVs. The distributions of deleterious SNVs across the exome before and after recent accelerated population growth are shown in the left and right panels, respectively. b, The bar plots summarize the number of genes segregating one or more deleterious SNVs that arose before (left) or after (right) recent accelerated population growth.
Figure 4. Heterogeneity of allele age across genes and pathways
a, Distribution of the proportion of deleterious SNVs for Mendelian, complex, essential, and other genes in EAs (top) and AAs (bottom) versus age in kyr. Data for each of the four categories of genes are shown in each plot, with darker lines representing the specific gene class indicated by the column label. Shaded regions define 95% confidence intervals obtained by bootstrapping. b, Average ages for deleterious (projecting up) and all (projecting down) SNVs across 235 KEGG pathways that can be organized into six broad classes (see legend on the right). Each of the six classes comprises multiple sub-classes, indicated by the different color shadings.
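
The Figure 4 caption mentions 95% confidence intervals obtained by bootstrapping. As a rough illustration of that general idea only, the sketch below computes a percentile bootstrap interval for the proportion of variants younger than a cutoff. It is not the authors' procedure: the function names, the cutoff, the number of resamples, and the toy ages are all assumptions made for this example, and the ages are invented rather than taken from the paper.

```python
import random


def proportion_younger(ages_kyr, cutoff_kyr):
    """Fraction of variants whose estimated age (kyr) is below cutoff_kyr."""
    return sum(a < cutoff_kyr for a in ages_kyr) / len(ages_kyr)


def bootstrap_ci(ages_kyr, cutoff_kyr, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for proportion_younger (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(ages_kyr)
    stats = []
    for _ in range(n_boot):
        # Resample variants with replacement, keeping the sample size fixed.
        resample = [ages_kyr[rng.randrange(n)] for _ in range(n)]
        stats.append(proportion_younger(resample, cutoff_kyr))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy, made-up ages in kyr (NOT data from the paper).
toy_ages = [0.5, 1.2, 2.0, 3.1, 4.4, 4.9, 6.0, 8.5, 12.0, 40.0]
point = proportion_younger(toy_ages, 5.0)
low, high = bootstrap_ci(toy_ages, 5.0)
print(point, low, high)
```

With so few toy values the interval is wide; the same resampling loop applies unchanged to a larger list of age estimates or to a different summary statistic.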
