Comparative Study
Proc Natl Acad Sci U S A. 2015 Sep 15;112(37):E5189-98.
doi: 10.1073/pnas.1511585112. Epub 2015 Aug 12.

Comparison of predicted and actual consequences of missense mutations

Lisa A Miosge et al. Proc Natl Acad Sci U S A.

Abstract

Each person's genome sequence has thousands of missense variants. Practical interpretation of their functional significance must rely on computational inferences in the absence of exhaustive experimental measurements. Here we analyzed the efficacy of these inferences in 33 de novo missense mutations revealed by sequencing in first-generation progeny of N-ethyl-N-nitrosourea-treated mice, involving 23 essential immune system genes. PolyPhen2, SIFT, MutationAssessor, Panther, CADD, and Condel were used to predict each mutation's functional importance, whereas the actual effect was measured by breeding and testing homozygotes for the expected in vivo loss-of-function phenotype. Only 20% of mutations predicted to be deleterious by PolyPhen2 (and 15% by CADD) showed a discernible phenotype in individual homozygotes. Half of all possible missense mutations in the same 23 immune genes were predicted to be deleterious, and most of these appear to become subject to purifying selection because few persist between separate mouse substrains, rodents, or primates. Because defects in immune genes could be phenotypically masked in vivo by compensation and environment, we compared inferences by the same tools with the in vitro phenotype of all 2,314 possible missense variants in TP53; 42% of mutations predicted by PolyPhen2 to be deleterious (and 45% by CADD) had little measurable consequence for TP53-promoted transcription. We conclude that for de novo or low-frequency missense mutations found by genome sequencing, half those inferred as deleterious correspond to nearly neutral mutations that have little impact on the clinical phenotype of individual cases but will nevertheless become subject to purifying selection.

Keywords: cancer; de novo mutation; evolution; immunodeficiency; nearly neutral.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Spectrum of functional consequences predicted by PolyPhen2 for different sources of missense variants in 23 essential immune system genes. PolyPhen2 scores were calculated for the following sets of missense variants: All possible, the complete set of 89,887 possible amino acid substitutions caused by single missense changes in the 23 mouse immune system genes listed in Table 1; ENU observed, 388 mostly unphenotyped, de novo mutations found in the 23 genes by exome sequencing of 2,081 G1 mice; ENU phenotype selected, mutations in the 23 genes discovered by flow cytometric screening of thousands of G3 offspring of ENU-exposed mice; Mouse-rat, missense variants between the C57BL/6J mouse and Brown Norway rat genome sequences; Mouse taxa, missense variants between the four wild-derived, inbred strains representing Mus spretus, Mus musculus musculus, Mus musculus domesticus, and Mus musculus castaneus; Mouse Lab strains, missense variants between the genomes of 14 inbred laboratory mouse strains; B6 mouse substrains, missense variants between the genomes of inbred strains C57BL/6J and C57BL/6N; Human-Chimp, missense variants in the orthologous 23 immune genes between the reference human and chimpanzee genome; Human variation, missense variants in the same immune genes detected by population-scale human exome sequencing (36). The barplots in gray indicate the number of variants present in each set. Purple violin plots are kernel density plots representing the distribution of PolyPhen2 scores for each variant set. The white dot indicates the median PolyPhen2 value for each set. The black bars are box-and-whisker plots: thick black bar extends from the first to the third quartile; thin black lines extend to the lowest and highest data points within 1.5 times the interquartile range.
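
The box-and-whisker convention described in this caption (box from the first to the third quartile, whiskers to the most extreme points within 1.5 times the interquartile range) can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `box_summary`, the choice of the "inclusive" quartile method, and the toy scores are all assumptions made here, and different plotting packages compute quartiles slightly differently.

```python
import statistics

def box_summary(values):
    """Summarise a set of scores the way a box-and-whisker plot does.

    Quartiles use Python's 'inclusive' method (an assumption; the
    caption does not say which quartile definition was used).
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data points inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }

# Made-up scores in the 0-1 range, for illustration only.
toy_scores = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.95]
summary = box_summary(toy_scores)
```

In this toy set the value 0.95 lies beyond the upper fence, so the upper whisker stops at 0.08 and 0.95 would be drawn as an outlier.
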
Fig. 2.
Comparison of computationally predicted damage with experimentally measured activity of TP53. (A–C) Analysis of all 2,314 amino acid substitutions in human TP53 that can arise from a single nucleotide substitution. The transcription-enhancing activity of each mutant, acting on the p21WAF1 target sequence in yeast (37), is shown on the y axis normalized to the activity of WT TP53 and multiplied by 100. Predicted damage for each possible mutation is plotted on the x axis, calculated by (A) PolyPhen2; (B) MutationAssessor; and (C) CADD. Regions marked with dashed lines denote true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). (D–F) The same analyses performed on 1,191 somatically acquired TP53 missense mutations from human cancers in the TP53 database. (G–I) The same analyses of 172 germ-line TP53 missense mutations found by clinical sequencing, primarily performed when the clinical phenotype was consistent with a germ-line TP53 mutation.
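
The four regions marked in these panels amount to a two-threshold rule: one cutoff on the predicted damage score and one on the measured activity. The sketch below shows that rule under stated assumptions. The function name `classify`, the cutoff values, and the toy data are hypothetical and chosen only for illustration; this caption does not state the cutoffs used in the panels.

```python
from collections import Counter

def classify(score, activity, score_cutoff, activity_cutoff):
    """Assign one mutation to a quadrant.

    'Positive' means predicted damaging (score above the cutoff);
    'true' means the prediction agrees with the measured activity.
    """
    predicted_damaging = score > score_cutoff
    lost_function = activity <= activity_cutoff
    if predicted_damaging:
        return "TP" if lost_function else "FP"
    return "FN" if lost_function else "TN"

# Made-up (score, percent-of-WT activity) pairs, for illustration only.
toy = [(0.99, 5.0), (0.95, 90.0), (0.10, 110.0), (0.20, 12.0), (0.90, 60.0)]

# Hypothetical cutoffs: score 0.8, activity 50% of WT.
counts = Counter(
    classify(s, a, score_cutoff=0.8, activity_cutoff=50.0) for s, a in toy
)

# Share of predicted-damaging calls that nevertheless retained activity.
fp_share = counts["FP"] / (counts["FP"] + counts["TP"])
```

The last quantity, FP / (FP + TP), is the kind of proportion the abstract reports when it says a given percentage of mutations predicted to be deleterious had little measurable consequence.
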
Fig. S1.
Analysis of PolyPhen2 predicted damage scores compared with the measured transactivation activity of all possible TP53 missense mutations using binding sequences from eight different TP53 target genes. The target binding sequences used in each transactivation assay are (A) p21WAF1, (B) MDM2, (C) BAX, (D) 14-3-3σ, (E) p53AIP1, (F) GADD45, (G) Noxa, and (H) p53R2.
Fig. 3.
Distribution of measured TP53 transactivation activity from true-negative and false-positive TP53 mutations. From all possible TP53 missense mutations, the distribution of measured transcriptional activity values is shown for those inferred as true negative (TN, CADD score < 5 and transactivation activity > 50; blue, n = 322) or as false positive (FP, CADD score > 20 and transactivation activity > 50; red, n = 573). The mean normalized activity was 108.8 for the TN set and 93.4 for the FP set. The distribution of values for each set is unlikely to be the same (Kolmogorov–Smirnov D = 0.3427, P < 2.2e−16; Wilcoxon W = 57,261.5, P < 2.2e−16).
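
The Kolmogorov–Smirnov D quoted in this caption is the largest vertical gap between the empirical cumulative distributions of the two sets. A minimal standard-library sketch of that statistic is given below. The function name `ks_statistic` and the toy activity values are assumptions made for illustration; it computes D only, not the P value, and it is not the authors' analysis code.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D.

    D is the largest absolute difference between the two empirical
    CDFs; because both are step functions, it is enough to check
    the difference at every observed value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Made-up normalized activities for a TN-like and an FP-like set.
toy_tn = [100.0, 110.0, 120.0, 105.0]
toy_fp = [60.0, 70.0, 100.0, 95.0]
d_value = ks_statistic(toy_tn, toy_fp)
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap between the samples). For P values, an established statistics package would be the more appropriate tool than a hand-written routine.
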
Fig. S2.
Distributions of residual TP53 transactivation activities of PolyPhen2-, MutationAssessor-, and CADD-assigned true-negative and false-positive mutations. Measured TP53 transactivation activity for true-negative (TN; blue) and false-positive (FP; red) categories of TP53 missense mutations identified in Fig. 2 A–C using inference scores calculated with (A) PolyPhen2, (B) MutationAssessor, and (C) CADD. For PolyPhen2, the mean normalized activity was 106.1 for the TN set and 94.6 for the FP set (Kolmogorov–Smirnov D = 0.2385, P = 2.454e−14; Wilcoxon W = 210,343.5, P = 3.541e−16). For MutationAssessor, the mean normalized activity was 104.1 for the TN set and 95.4 for the FP set (Kolmogorov–Smirnov D = 0.1792, P = 2.204e−10; Wilcoxon W = 327,022.5, P = 3.364e−12). For CADD, when intermediate values (score > 5 and < 20) are included with the TN category, the mean normalized activity was 105.3 for the TN set and 93.4 for the FP set (Kolmogorov–Smirnov D = 0.2401, P < 2.2e−16; Wilcoxon W = 190,543.5, P < 2.2e−16).
Fig. S3.
Incidence and location of apparent false-positive TP53 somatic mutations. (A) Unique TP53 somatic mutations found in human cancers, separated into categories as shown in Fig. 2D and ranked by the number of independent occurrences in separate cancer cases. Red, true positive; green, true negative; blue, false positive; purple, false negative. (B) The ideogram shows the major functional domains of the TP53 protein. Plotted against this is the number of false-positive mutations in bins of 20 amino acids: from the set of all possible TP53 mutations, those that were predicted to be deleterious with a PolyPhen2 score > 0.8 yet actually retained > 40% of WT transactivation activity. Unlike most oncogenic TP53 mutations, which occur in the DNA-binding domain (red), false-positive mutations appear concentrated in the extremities of the transactivation and tetramerization domains.

References

    1. MacArthur DG, et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science. 2012;335(6070):823–828.
    2. Andrews TD, Sjollema G, Goodnow CC. Understanding the immunological impact of the human mutation explosion. Trends Immunol. 2013;34(3):99–106.
    3. Gnad F, Baucom A, Mukhyala K, Manning G, Zhang Z. Assessment of computational methods for predicting the effects of missense mutations in human cancers. BMC Genomics. 2013;14(Suppl 3):S7.
    4. Thusberg J, Olatubosun A, Vihinen M. Performance of mutation pathogenicity prediction methods on missense variants. Hum Mutat. 2011;32(4):358–368.
    5. Hicks S, Wheeler DA, Plon SE, Kimmel M. Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Hum Mutat. 2011;32(6):661–668.
