SubLocEP: a novel ensemble predictor of subcellular localization of eukaryotic mRNA based on machine learning
- PMID: 33388743
- DOI: 10.1093/bib/bbaa401
Abstract
Motivation: The location of an mRNA corresponds to the site of protein translation and contributes to precise spatial and temporal control of protein function. However, current methods for predicting the subcellular localization of eukaryotic mRNA have important limitations: (1) decomposing the multi-class problem into multiple binary classifications makes the training process tedious; (2) most models trained with classical algorithms rely on a single type of sequence information; (3) the existing state-of-the-art models have not reached an ideal level of prediction and generalization ability. A better and more comprehensive model is therefore needed to predict the subcellular localization of eukaryotic mRNA.
Results: In this paper, SubLocEP is proposed as a two-layer integrated prediction model for accurately predicting the location of sequence samples. Unlike existing models based on limited features, SubLocEP considers additional feature attributes and combines them with LightGBM to generate single-feature classifiers. The initial integration model (single-layer model) is generated for each feature category. Subsequently, the two single-layer integration models are weighted (sequence-based : physicochemical properties = 3:2) to produce the final two-layer model. The performance of SubLocEP on independent datasets indicates that it is an accurate and stable prediction model with strong generalization ability. Additionally, an online tool has been developed that contains the experimental data and makes it convenient for users to estimate the subcellular localization of eukaryotic mRNA.
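The 3:2 weighting of the two single-layer models can be illustrated with a minimal sketch. The abstract does not say how the single-feature classifiers are combined within a feature category, so the plain averaging used for the first layer is an assumption, and the function names and toy probabilities are hypothetical. Only the 3:2 ratio between the sequence-based and physicochemical models comes from the text.

```python
import numpy as np

def single_layer_model(prob_list):
    """Layer 1 (assumed): average the class-probability matrices produced by
    the single-feature classifiers that belong to one feature category."""
    return np.mean(np.stack(prob_list, axis=0), axis=0)

def two_layer_model(seq_probs, phys_probs, w_seq=3.0, w_phys=2.0):
    """Layer 2: weight the two single-layer models 3:2
    (sequence-based : physicochemical) and pick the top class."""
    layer1_seq = single_layer_model(seq_probs)
    layer1_phys = single_layer_model(phys_probs)
    combined = (w_seq * layer1_seq + w_phys * layer1_phys) / (w_seq + w_phys)
    return combined, combined.argmax(axis=1)

# Toy input: one mRNA sample, three hypothetical location classes.
# Each array stands in for the predicted probabilities of one
# single-feature classifier (e.g. the output of a trained LightGBM model).
seq_probs = [np.array([[0.6, 0.3, 0.1]]), np.array([[0.4, 0.4, 0.2]])]
phys_probs = [np.array([[0.2, 0.7, 0.1]])]

combined, labels = two_layer_model(seq_probs, phys_probs)
print(combined)  # weighted class probabilities for the sample
print(labels)    # index of the predicted location class
```

Because the weights are normalized by their sum, each row of the combined output remains a probability distribution whenever the inputs are.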
Keywords: LightGBM; ensemble model; feature extraction; subcellular localization of eukaryotic mRNA.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.