Going from where to why--interpretable prediction of protein subcellular localization
- PMID: 20299325
- PMCID: PMC2859129
- DOI: 10.1093/bioinformatics/btq115
Abstract
Motivation: Protein subcellular localization is pivotal in understanding a protein's function. Computational prediction of subcellular localization has become a viable alternative to experimental approaches. While current machine learning-based methods yield good prediction accuracy, most of them suffer from two key problems: a lack of interpretability and difficulty handling proteins with multiple locations.
Results: We present YLoc, a novel method for predicting protein subcellular localization that addresses these issues. Due to its simple architecture, YLoc can identify the relevant features of a protein sequence contributing to its subcellular localization, e.g. localization signals or motifs relevant to protein sorting. We present several example applications where YLoc identifies the sequence features responsible for protein localization, and thus reveals not only to which location a protein is transported, but also why it is transported there. YLoc also provides a confidence estimate for each prediction, so the user can decide what level of error is acceptable. Due to a probabilistic approach and the use of several thousand dual-targeted proteins, YLoc is able to predict multiple locations per protein. YLoc was benchmarked using several independent datasets for protein subcellular localization and performs on par with other state-of-the-art predictors. When low-confidence predictions are disregarded, YLoc can achieve prediction accuracies of over 90%. Moreover, we show that YLoc is able to reliably predict multiple locations and outperforms the best predictors in this area.
Availability: www.multiloc.org/YLoc.
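The trade-off described in the abstract, where discarding low-confidence predictions raises accuracy at the cost of coverage, and where a probabilistic output allows more than one location to be reported, can be illustrated with a small sketch. This is not YLoc's implementation: the function names, the `min_share` cutoff, and the toy records below are all hypothetical and invented purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical helper: report every location whose probability reaches
# min_share; fall back to the single most probable location otherwise.
def predict_locations(probs: Dict[str, float], min_share: float = 0.3) -> List[str]:
    chosen = [loc for loc, p in probs.items() if p >= min_share]
    if not chosen:
        chosen = [max(probs, key=probs.get)]
    return sorted(chosen)

# Hypothetical helper: accuracy and coverage after discarding predictions
# whose confidence is below the threshold.
# Each record is (confidence, predicted location, true location).
def accuracy_at_confidence(
    records: List[Tuple[float, str, str]], threshold: float
) -> Tuple[float, float]:
    kept = [(pred, true) for conf, pred, true in records if conf >= threshold]
    if not kept:
        return 0.0, 0.0
    correct = sum(1 for pred, true in kept if pred == true)
    return correct / len(kept), len(kept) / len(records)

# Toy data, made up for this example (not taken from the paper).
toy_records = [
    (0.95, "nucleus", "nucleus"),
    (0.90, "cytoplasm", "cytoplasm"),
    (0.55, "mitochondrion", "cytoplasm"),
    (0.40, "nucleus", "mitochondrion"),
]

if __name__ == "__main__":
    print(predict_locations({"cytoplasm": 0.48, "nucleus": 0.45, "mitochondrion": 0.07}))
    for t in (0.0, 0.8):
        acc, cov = accuracy_at_confidence(toy_records, t)
        print(f"threshold={t:.1f} accuracy={acc:.2f} coverage={cov:.2f}")
```

On the toy records, raising the threshold from 0.0 to 0.8 keeps only the two confident predictions, so accuracy rises while coverage halves. The acceptable error level is therefore a choice left to the user.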