PLoS Genet. 2013;9(8):e1003709.
doi: 10.1371/journal.pgen.1003709. Epub 2013 Aug 22.

Genic intolerance to functional variation and the interpretation of personal genomes

Slavé Petrovski et al. PLoS Genet. 2013.

Abstract

A central challenge in interpreting personal genomes is determining which mutations most likely influence disease. Although progress has been made in scoring the functional impact of individual mutations, the characteristics of the genes in which those mutations are found remain largely unexplored. For example, genes known to carry few common functional variants in healthy individuals may be judged more likely to cause certain kinds of disease than genes known to carry many such variants. Until now, however, it has not been possible to develop a quantitative assessment of how well genes tolerate functional genetic variation on a genome-wide scale. Here we describe an effort that uses sequence data from 6503 whole exome sequences made available by the NHLBI Exome Sequencing Project (ESP). Specifically, we develop an intolerance scoring system that assesses whether genes have relatively more or less functional genetic variation than expected based on the apparently neutral variation found in the gene. To illustrate the utility of this intolerance score, we show that genes responsible for Mendelian diseases are significantly more intolerant to functional genetic variation than genes that do not cause any known disease, but with striking variation in intolerance among genes causing different classes of genetic disease. We conclude by showing that use of an intolerance ranking system can aid in interpreting personal genomes and identifying pathogenic mutations.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. A regression plot illustrating the regression of Y on X.
The plot is annotated for the 2% extremes: red = 2% most intolerant, blue = 2% most tolerant. Five outlier genes with >140 common functional variant sites (y-axis) are not shown.
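The regression behind Figure 1 can be sketched in a few lines. This is an illustration only: the gene names and counts are hypothetical toy data, X is assumed to be a per-gene count of all variants, Y a per-gene count of common functional variants, and raw ordinary least-squares residuals stand in for whatever residual standardization the authors actually apply. A gene with a strongly negative residual carries fewer common functional variants than its total variation predicts, which is read here as intolerance.

```python
from statistics import mean


def ols_residuals(x, y):
    """Residuals from an ordinary least-squares fit of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def percentile_ranks(scores):
    """Percentage of genes whose score is at or below each gene's score."""
    n = len(scores)
    return [100.0 * sum(s <= score for s in scores) / n for score in scores]


# Hypothetical toy data (not taken from the paper).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
total_variants = [10, 20, 30, 40, 50]   # X: assumed per-gene variant count
common_functional = [1, 5, 4, 9, 6]     # Y: assumed common functional count

residuals = ols_residuals(total_variants, common_functional)
ranks = percentile_ranks(residuals)

# Lowest residual = least functional variation relative to expectation.
most_intolerant = genes[residuals.index(min(residuals))]
most_tolerant = genes[residuals.index(max(residuals))]
```

With genome-wide data, the 2% extremes annotated in the plot would correspond to the genes whose percentile rank falls at or below 2 (most intolerant) or above 98 (most tolerant).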
Figure 2. [A] Cumulative percentage plots for the residual variation intolerance scores among six OMIM lists. [B] ROC curves of the residual variation intolerance scores' capacity to predict the corresponding OMIM list.
Figure 3. ROC curves of the residual variation intolerance scores' capacity to predict the corresponding independent gene-list.
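The area under an ROC curve such as those in Figures 2B and 3 can be computed directly from scores and list membership. The sketch below is a generic pairwise (Mann-Whitney style) AUC, not the authors' code; the scores and labels are hypothetical, and it assumes that a lower score means a more intolerant gene and therefore a stronger prediction of list membership.

```python
def roc_auc(scores, in_list):
    """AUC: probability that a randomly chosen listed gene has a lower
    (more intolerant) score than a randomly chosen unlisted gene.
    Ties count as half a win."""
    listed = [s for s, flag in zip(scores, in_list) if flag]
    unlisted = [s for s, flag in zip(scores, in_list) if not flag]
    wins = 0.0
    for a in listed:
        for b in unlisted:
            if a < b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(listed) * len(unlisted))


# Hypothetical intolerance scores and gene-list membership flags.
scores = [-1.5, -0.8, -0.2, 0.1, 0.9, 1.4]
in_list = [True, True, False, True, False, False]

auc = roc_auc(scores, in_list)
```

An AUC of 0.5 would mean the score carries no information about list membership, while 1.0 would mean every listed gene scores lower than every unlisted gene.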
Figure 4. The proportion of genes falling in each 25-percentile bin of RVIS for the two human disease network disorder classes with the lowest (“Developmental Disorders”) and highest (“Immunological Disorders”) average residual variation intolerance score.
Figure 5. The percentage of de novo mutations occurring in the most intolerant quartile (25th percentile) across the severe ID, autism, epileptic encephalopathy, and control sibling cohorts, for the different variant effect types.
LGD = Likely Gene Disrupting (including nonsense, coding indels, and splice acceptor/donor site mutations). *Taking the CCDS of RVIS genes, 38% reflects the total real estate occupied by the 25th percentile most intolerant genes. P-values reflect binomial exact tests where the probability of success is adjusted to 0.38, accounting for the gene sizes of the 25% most intolerant genes.
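The size-adjusted binomial test described in this caption can be illustrated as follows. The null proportion of 0.38 comes from the caption; the mutation counts are hypothetical, and the one-sided (upper tail) alternative is an assumption, since the caption does not state which tail the authors used.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_success):
    """Exact P(X >= successes) for X ~ Binomial(trials, p_success)."""
    return sum(
        comb(trials, k) * p_success**k * (1.0 - p_success) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Share of coding sequence occupied by the 25% most intolerant genes,
# as stated in the caption; used instead of a naive 0.25.
NULL_PROPORTION = 0.38

# Hypothetical counts: 30 of 50 de novo mutations in intolerant genes.
observed_in_intolerant = 30
total_mutations = 50

p_value = binomial_upper_tail(
    observed_in_intolerant, total_mutations, NULL_PROPORTION
)
```

Using 0.38 rather than 0.25 as the null keeps the test from rewarding the most intolerant genes simply for presenting a larger mutational target.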
Figure 6. 2D plots illustrating possible utility of RVIS in conjunction with a variant-level quantitative score (PolyPhen-2) across cohorts with proposed de novo mutation genetic architectures.
Plots reflect the single most damaging de novo missense mutation in individuals with at least one de novo missense mutation: [A] Controls (n = 247); [B] Severe ID (n = 67); [C] Epileptic Encephalopathies (n = 134); [D] Autism Spectrum Disorders (n = 412). Full lists of missense de novo mutations in the “hot zone” are available in Dataset S3, including loss of function SNV mutations (not plotted).
