Nucleic Acids Res. 2015 Jul 1;43(W1):W141-7.
doi: 10.1093/nar/gkv461. Epub 2015 May 15.

SIFTER search: a web server for accurate phylogeny-based protein function prediction

Sayed M Sahraeian et al. Nucleic Acids Res. 2015.

Abstract

We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. The SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.
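To make the idea of propagating annotations through a phylogenetic tree concrete, here is a minimal toy sketch. It is not the SIFTER model: it assumes a single binary function (present or absent), a symmetric per-branch change probability `p_change`, and a uniform prior at the root, and it ignores the duplication/speciation distinction that SIFTER uses. The names `Node`, `partial_likelihood`, and `posterior_has_function` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A node in a rooted tree; leaves may carry an observed annotation."""
    name: str
    children: List["Node"] = field(default_factory=list)
    observed: Optional[int] = None  # 1 = has function, 0 = lacks it, None = unknown


def transition(parent_state: int, child_state: int, p_change: float) -> float:
    """Toy assumption: the function flips along a branch with probability p_change."""
    return p_change if parent_state != child_state else 1.0 - p_change


def partial_likelihood(node: Node, p_change: float) -> List[float]:
    """Return [P(observations below node | node=0), P(... | node=1)] by pruning."""
    if not node.children:
        if node.observed is None:
            return [1.0, 1.0]  # an unannotated leaf carries no evidence
        return [1.0 if node.observed == s else 0.0 for s in (0, 1)]
    result = [1.0, 1.0]
    for child in node.children:
        child_like = partial_likelihood(child, p_change)
        for s in (0, 1):
            result[s] *= sum(
                transition(s, c, p_change) * child_like[c] for c in (0, 1)
            )
    return result


def posterior_has_function(root: Node, query: Node, p_change: float = 0.1) -> float:
    """Posterior probability that the unannotated leaf `query` has the function."""
    joint = []
    for state in (0, 1):
        query.observed = state  # clamp the query leaf to each state in turn
        like = partial_likelihood(root, p_change)
        joint.append(0.5 * like[0] + 0.5 * like[1])  # uniform root prior (assumed)
    query.observed = None  # restore the query to "unknown"
    return joint[1] / (joint[0] + joint[1])


# Usage: one annotated sibling (A has the function) and one unannotated query.
query = Node("Q")
root = Node("root", [Node("A", observed=1), query])
print(round(posterior_has_function(root, query, p_change=0.1), 2))  # 0.82
```

In this toy setting the query inherits most of its sibling's annotation, and the confidence drops as `p_change` grows. A real phylogeny-based predictor would handle many candidate functions and branch-specific rates.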

Figures

Figure 1. Phylogeny-based protein function prediction with SIFTER. The reconciliation distinguishes duplication from speciation at each internal node. Colors indicate functions.

Figure 2. (A) The ROC-like comparison of SIFTER with BLAST for the Nudix family of proteins. SIFTER consistently dominates BLAST annotations in this family. (Figure adapted from (8)). (B) The CAFA precision-recall analysis of SIFTER, BLAST and naïve weighted random. (Data provided by CAFA2 analyst, Jiang Yuxiang).

Figure 3. Sample output for searching SIFTER predictions by protein ID. Results are shown for protein PA24B_MOUSE.

Figure 4. Sample output for searching SIFTER predictions for homologs of a given sequence.

Figure 5. Estimating SIFTER processing time for a sample Pfam family (PF00735) with 11 candidate molecular functions and family size 2609.

References

    1. Wass M.N., Sternberg M.J.E. ConFunc-functional annotation in the twilight zone. Bioinformatics. 2008;24:798–806.
    2. Martin D.M., Berriman M., Barton G.J. GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes. BMC Bioinformatics. 2004;5:178.
    3. Hawkins T., Luban S., Kihara D. Enhanced automated function prediction using distantly related sequences and contextual association by PFP. Protein Sci. 2006;15:1550–1556.
    4. Clark W.T., Radivojac P. Analysis of protein function and its prediction from amino acid sequence. Proteins. 2011;79:2086–2096.
    5. Pazos F., Sternberg M.J. Automated prediction of protein function and detection of functional sites from structure. Proc. Natl. Acad. Sci. U.S.A. 2004;101:14754–14759.
