Nucleic Acids Res. 2014 Jul;42(Web Server issue):W350-5.
doi: 10.1093/nar/gku396. Epub 2014 May 21.

LocTree3 prediction of localization

Tatyana Goldberg et al. Nucleic Acids Res. 2014 Jul.

Abstract

The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18=80±3% for eukaryotes and a six-state accuracy Q6=89±4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome not considering the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3.
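The abstract says that LocTree3 adds homology-based inference to the machine learning-based LocTree2 but does not spell out how the two are combined. A minimal sketch of one plausible combination follows, assuming that a sufficiently significant hit to an annotated protein takes precedence and that the machine learning prediction is the fallback. The names `Hit`, `Prediction`, `predict_localization`, `ml_predict` and the E-value cutoff `max_e_value` are all hypothetical and are not taken from the LocTree3 server.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hit:
    """Best homology hit for a query (hypothetical structure)."""
    localization: str  # annotated localization of the hit
    e_value: float     # significance of the alignment


@dataclass
class Prediction:
    localization: str
    source: str  # which component produced the prediction


def predict_localization(
    sequence: str,
    best_hit: Optional[Hit],
    ml_predict: Callable[[str], str],
    max_e_value: float = 1e-3,  # assumed cutoff, not the published one
) -> Prediction:
    """Transfer the annotation of a significant hit, else fall back to ML."""
    if best_hit is not None and best_hit.e_value <= max_e_value:
        return Prediction(best_hit.localization, "homology")
    return Prediction(ml_predict(sequence), "machine learning")
```

Recording the source with each prediction mirrors the "source of the prediction" field that the server output reports.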

Figures

Figure 1.
Reliable predictions more accurate. The reliability index (RI) of LocTree3 relates the strength of a prediction to the performance. The curves show the percentage accuracy/coverage (‘Materials and Methods’ section) for LocTree3 predictions above a given RI. Increasing the RI implies that we look at some subset of all predictions; the subset is given by the curves with squares. For instance, half of all eukaryotic proteins are predicted at RI > 70 (black cross-line). For this top 50%, performance rises from the average Q18 = 80% to Q18 = 95% (black line with circles, black arrow). Similar values are reached for RI > 80 for bacteria (gray cross-line; note that in this case Q6 = 95% is a six-state accuracy as opposed to the 18-state value for eukaryotes).
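The trade-off described in this caption can be sketched as follows: keep only predictions above an RI threshold, then report the accuracy on that subset together with the fraction of proteins it retains. This is an illustrative sketch only. The function name, the record layout and the toy data are assumptions, and the exact accuracy/coverage definitions are those of the paper's 'Materials and Methods' section, which is not reproduced here.

```python
def accuracy_above_ri(records, ri_threshold):
    """Return (accuracy %, fraction of proteins %) for predictions with RI > threshold.

    records: list of (ri, observed_class, predicted_class) tuples.
    Accuracy is None when no prediction passes the threshold.
    """
    if not records:
        raise ValueError("records must not be empty")
    selected = [(obs, pred) for ri, obs, pred in records if ri > ri_threshold]
    fraction = 100.0 * len(selected) / len(records)
    if not selected:
        return None, fraction
    correct = sum(obs == pred for obs, pred in selected)
    return 100.0 * correct / len(selected), fraction


# Toy data, invented for illustration only.
toy = [
    (90, "nucleus", "nucleus"),
    (85, "cytoplasm", "cytoplasm"),
    (60, "nucleus", "cytoplasm"),
    (40, "secreted", "secreted"),
]
print(accuracy_above_ri(toy, 0))   # all four predictions
print(accuracy_above_ri(toy, 70))  # only the two most reliable ones
```

Raising the threshold shrinks the subset while its accuracy rises, which is the pattern the curves in the figure describe.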
Figure 2.
Example output for protein RP9_HUMAN. For every input protein sequence the LocTree3 prediction result contains: (i) protein identifier, (ii) reliability index, (iii) expected accuracy of the prediction, (iv) localization class, (v) GO term(s) and identifier(s) and (vi) source of the prediction. The predicted localization is highlighted in the schematic representation of the cell (here: nucleus). For LocTree2 predictions (shown here), we provide a visualization of the decision tree and the decision path leading to the final prediction. The reliability index is formed through the product of values along the decision path. For PSI-BLAST predictions, we provide a sequence alignment of the query protein to its best hit instead of the tree.
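The statement that the reliability index is the product of values along the decision path can be illustrated with a small sketch. It assumes that each node on the path contributes a value between 0 and 1 and that the product is scaled to a 0-100 range; both the per-node value range and the scaling are assumptions made for illustration, and `reliability_index` is a hypothetical helper, not part of LocTree3.

```python
import math


def reliability_index(path_values):
    """Multiply per-node values along a decision path and scale to 0-100.

    path_values: values in [0, 1], one per node on the path (assumed range).
    """
    if not path_values:
        raise ValueError("the decision path must contain at least one node")
    if any(not 0.0 <= v <= 1.0 for v in path_values):
        raise ValueError("each node value must lie in [0, 1]")
    return round(100 * math.prod(path_values))


# Hypothetical three-node path; one weak node lowers the overall index.
print(reliability_index([0.9, 0.8, 1.0]))
```

Because the values are multiplied, a single uncertain decision anywhere on the path reduces the index for the final prediction.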
