G3 (Bethesda). 2018 Oct 3;8(10):3321-3329.
doi: 10.1534/g3.118.200563.

Comparative Genomics Approaches Accurately Predict Deleterious Variants in Plants

Thomas J Y Kono et al. G3 (Bethesda).

Abstract

Recent advances in genome resequencing have led to increased interest in prediction of the functional consequences of genetic variants. Variants at phylogenetically conserved sites are of particular interest, because they are more likely than variants at phylogenetically variable sites to have deleterious effects on fitness and contribute to phenotypic variation. Numerous comparative genomic approaches have been developed to predict deleterious variants, but the approaches are nearly always assessed based on their ability to identify known disease-causing mutations in humans. Determining the accuracy of deleterious variant predictions in nonhuman species is important for understanding evolution and domestication, and potentially for improving crop quality and yield. To examine our ability to predict deleterious variants in plants, we generated a curated database of 2,910 Arabidopsis thaliana mutants with known phenotypes. We evaluated seven approaches and found that while all performed well, their relative ranking differed from prior benchmarks in humans. We conclude that deleterious mutations can be reliably predicted in A. thaliana and likely other plant species, but that the relative performance of various approaches does not necessarily translate from one species to another.

Keywords: deleterious mutations; genome; phenotypes; training set.

Figures

Figure 1
Comparison of approaches that distinguish deleterious and neutral amino acid substitutions. The fraction of true positives (sensitivity) vs. the fraction of true negatives (specificity) is shown for seven approaches (LRTm is a masked version of LRT, PPH2 is PolyPhen2). The curves are based on 2,910 deleterious variants and 1,583 neutral variants. Vertical and horizontal dashed lines show the cutoff at 95% specificity and 95% sensitivity, respectively.
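
As a rough illustration of the 95% specificity cutoff mentioned in this caption, the sketch below picks a score cutoff from the neutral variants and then reports sensitivity and specificity at that cutoff. It is not the authors' code. The function names are hypothetical, and it assumes that a higher score means "more deleterious" (some tools score in the opposite direction, in which case the scores would need to be negated first).

```python
import math

def cutoff_at_specificity(neutral_scores, specificity=0.95):
    """Return the smallest observed cutoff that classifies at least
    `specificity` of the neutral variants as neutral (score <= cutoff)."""
    ordered = sorted(neutral_scores)
    # The small epsilon guards against floating-point error in the product.
    needed = max(1, math.ceil(specificity * len(ordered) - 1e-9))
    return ordered[needed - 1]

def sensitivity_specificity(deleterious_scores, neutral_scores, cutoff):
    """Assumption: a variant is called deleterious when its score > cutoff."""
    true_pos = sum(s > cutoff for s in deleterious_scores)
    true_neg = sum(s <= cutoff for s in neutral_scores)
    return true_pos / len(deleterious_scores), true_neg / len(neutral_scores)

# Toy data, invented for illustration only.
neutral = list(range(1, 21))
deleterious = [5, 18, 25, 30]
cutoff = cutoff_at_specificity(neutral)
sens, spec = sensitivity_specificity(deleterious, neutral, cutoff)
```

Sweeping the cutoff over all observed scores, rather than fixing it at one specificity level, would trace out the full sensitivity vs. specificity curve.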
Figure 2
The proportion of SNPs called deleterious across frequency classes. The fraction of SNPs called deleterious by each approach (legend) at its 95% specificity threshold across five frequency classes, labeled by the number of minor alleles present (n = 80). The minor allele is defined as the allele that is less frequent in the sample. Sample sizes for the five classes are 5,303 (1), 1,646 (2), 1,250 (3-4), 1,015 (5-8) and 1,583 (>8).
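
The binning described in this caption can be sketched as follows. This is a minimal example under stated assumptions, not the authors' pipeline: the input format (pairs of minor allele count and a boolean call) and all names are hypothetical, while the five class boundaries and the sample size of 80 come from the caption.

```python
from collections import defaultdict

# Frequency classes from the caption: (label, lowest count, highest count).
BINS = [("1", 1, 1), ("2", 2, 2), ("3-4", 3, 4), ("5-8", 5, 8), (">8", 9, None)]

def minor_allele_count(allele_count, sample_size=80):
    """The minor allele is the less frequent of the two alleles in the sample."""
    return min(allele_count, sample_size - allele_count)

def frequency_class(count):
    for label, low, high in BINS:
        if count >= low and (high is None or count <= high):
            return label
    raise ValueError(f"minor allele count must be >= 1, got {count}")

def fraction_deleterious(snps):
    """snps: iterable of (minor_allele_count, called_deleterious) pairs."""
    totals = defaultdict(int)
    calls = defaultdict(int)
    for count, called in snps:
        label = frequency_class(count)
        totals[label] += 1
        calls[label] += bool(called)
    return {label: calls[label] / totals[label] for label in totals}

# Toy data, invented for illustration only.
toy = [(1, True), (1, False), (2, False), (4, True), (9, False)]
fractions = fraction_deleterious(toy)
```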
Figure 3
Performance of approaches across different classes of sites. Performance is measured by the area under the curve (AUC) of the approach’s sensitivity vs. specificity. A – comparison of mutants with biochemical (n = 1,000) vs. gross phenotypes (n = 1,910). B – comparison of performance for substitutions in duplicated (n = 2,098) vs. single copy genes (n = 2,395). Confidence intervals were determined by 2,000 bootstrapping iterations.
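
One way to obtain an AUC with a bootstrap confidence interval is sketched below. The caption states only that 2,000 bootstrap iterations were used. The resampling scheme (resampling deleterious and neutral variants separately, with replacement) and the percentile interval are assumptions made for this example, as are the function names and the convention that higher scores are more deleterious.

```python
import random
from bisect import bisect_left, bisect_right

def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative,
    counting ties as one half (the rank-based form of the AUC)."""
    neg_sorted = sorted(neg_scores)
    total = 0.0
    for p in pos_scores:
        below = bisect_left(neg_sorted, p)
        ties = bisect_right(neg_sorted, p) - below
        total += below + 0.5 * ties
    return total / (len(pos_scores) * len(neg_scores))

def bootstrap_auc_ci(pos_scores, neg_scores, iterations=2000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling each class with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(iterations):
        pos = rng.choices(pos_scores, k=len(pos_scores))
        neg = rng.choices(neg_scores, k=len(neg_scores))
        stats.append(auc(pos, neg))
    stats.sort()
    lower = stats[int((alpha / 2) * iterations)]
    upper = stats[int((1 - alpha / 2) * iterations) - 1]
    return lower, upper
```

Running the same two functions on each subset of variants (for example, biochemical vs. gross phenotypes) would give one AUC and interval per subset for comparison.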
Figure 4
Dissimilarities among approaches. Dissimilarities were computed by the pairwise number of disagreements between each approach applied to mutants and common SNPs (n = 4,493). Dissimilarities are represented by a tree based on hierarchical clustering and values below nodes are bootstrap support based on 2,000 iterations.
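
The dissimilarity measure in this caption, the pairwise number of disagreements, can be illustrated with a small sketch. The caption does not say which linkage was used for the hierarchical clustering, so average linkage is an assumption here, and the names and toy calls are hypothetical. Bootstrap support is not computed in this example.

```python
from itertools import combinations

def disagreements(calls_a, calls_b):
    """Number of variants on which two approaches make different calls."""
    return sum(a != b for a, b in zip(calls_a, calls_b))

def dissimilarities(calls):
    """calls: dict mapping approach name -> list of calls over the same variants."""
    return {frozenset((a, b)): disagreements(calls[a], calls[b])
            for a, b in combinations(calls, 2)}

def average_linkage_tree(calls):
    """Agglomerative clustering (average linkage, an assumption).
    Returns the tree as nested tuples of approach names."""
    dist = dissimilarities(calls)
    clusters = [(name, [name]) for name in calls]  # (subtree, member names)

    def cluster_distance(i, j):
        members_i, members_j = clusters[i][1], clusters[j][1]
        total = sum(dist[frozenset((a, b))] for a in members_i for b in members_j)
        return total / (len(members_i) * len(members_j))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: cluster_distance(*pair))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Toy calls for three hypothetical approaches over four variants.
toy_calls = {
    "A": [True, True, False, False],
    "B": [True, True, False, True],
    "C": [False, False, True, True],
}
tree = average_linkage_tree(toy_calls)
```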
