BMC Genomics. 2013;14 Suppl 3(Suppl 3):S6.
doi: 10.1186/1471-2164-14-S3-S6. Epub 2013 May 28.

WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation

Emidio Capriotti et al. BMC Genomics. 2013.

Abstract

Background: SNPs&GO is a method for predicting deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM). For a given protein, its input comprises the sequence and/or three-dimensional structure (when available), a set of target variations, and the protein's functional Gene Ontology (GO) terms. For each protein variation, the server outputs the probability that the variation is associated with human disease.

Results: The server consists of two main components: updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO(3d) programs. Both the sequence-based and structure-based algorithms were extensively tested on a large set of annotated variations extracted from the SwissVar database. On a balanced dataset of more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, a correlation coefficient of 0.61, and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. On the subset of ~6,600 variations mapped onto protein structures available in the Protein Data Bank (PDB), the structure-based method achieves 84% overall accuracy, a correlation coefficient of 0.68, and an AUC of 0.91. When tested on a new blind set of variations, the server reaches 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively.
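The three figures of merit quoted above (overall accuracy, correlation coefficient, and ROC AUC) can be illustrated with a minimal sketch. It assumes that the "correlation coefficient" is the Matthews correlation coefficient, that a variation is called disease-related when its output probability exceeds 0.5, and that labels are encoded as 1 (disease) and 0 (neutral); none of these details are stated in the abstract. The function names and the toy labels and scores are hypothetical and are not data from the paper.

```python
import math

def confusion_counts(labels, predictions):
    """Count true/false positives and negatives (1 = disease, 0 = neutral)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, tn, fp, fn

def overall_accuracy(tp, tn, fp, fn):
    """Fraction of variations classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_correlation(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when a marginal count is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def roc_auc(labels, scores):
    """AUC as the probability that a disease variant outscores a neutral one
    (ties count as half); equivalent to the area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy, made-up example: four disease and four neutral variations.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]  # hypothetical probabilities
predictions = [1 if s > 0.5 else 0 for s in scores]  # assumed 0.5 threshold

counts = confusion_counts(labels, predictions)
accuracy = overall_accuracy(*counts)
mcc = matthews_correlation(*counts)
auc = roc_auc(labels, scores)
print(f"accuracy={accuracy:.2f} mcc={mcc:.2f} auc={auc:.4f}")
```

Accuracy and the correlation coefficient depend on the chosen probability threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the three values are reported together.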

Conclusions: WS-SNPs&GO is a valuable tool that integrates, in a single framework, information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go.

Figures

Figure 1
Schematic view of SNPs&GO (panel A) and SNPs&GO3d (panel B). From left to right: the SNPs&GO and SNPs&GO3d input web pages, the flow charts of the sequence-based and structure-based methods, and two examples of the returned output.
Figure 2
Performance of SNPs&GO and SNPs&GO3d on the SAP-NEW dataset (DB). Panel A shows the ROC curves of both methods. Panels B and C report the performance of SNPs&GO and SNPs&GO3d as a function of the Reliability Index (RI).
