Brief Bioinform. 2024 Sep 23;25(6):bbae504.
doi: 10.1093/bib/bbae504.

MSlocPRED: deep transfer learning-based identification of multi-label mRNA subcellular localization

Yun Zuo et al. Brief Bioinform.

Abstract

Subcellular localization of messenger ribonucleic acid (mRNA) is a universal mechanism for precise and efficient control of translation. Although many computational methods have been developed to predict mRNA subcellular localization, very few are designed for mRNAs with multiple localization annotations, and their generalization performance leaves room for improvement. In this study, the prediction model MSlocPRED was constructed to identify multi-label mRNA subcellular localization. First, the preprocessed Dataset 1 and Dataset 2 were transformed into images. The proposed MDNDO-SMDU resampling technique was then used to balance the number of samples per category in the training dataset. Finally, deep transfer learning was used to build MSlocPRED, which identifies subcellular localization across 16 classes (Dataset 1) and 18 classes (Dataset 2). Comparative tests of different resampling techniques show that the proposed resampling technique is more effective as a preprocessing step for subcellular localization prediction. Prediction results on datasets constructed by intercepting different lengths of the NC end (the 5' and 3' untranslated regions, which flank the protein-coding sequence and influence mRNA function without themselves encoding protein) show that, for both Dataset 1 and Dataset 2, performance is best when 35 nucleotides are intercepted at the NC end. Both independent testing and five-fold cross-validation comparisons with established prediction tools show that MSlocPRED is significantly better than those tools at identifying multi-label mRNA subcellular localization. Additionally, SHapley Additive exPlanations (SHAP) was used to explain how the MSlocPRED model arrives at its predictions.
The predictive model and associated datasets are available on GitHub: https://github.com/ZBYnb1/MSlocPRED/tree/main.
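
As a rough illustration of the preprocessing described above, the sketch below truncates a sequence to its two ends, turns it into an image-like matrix, and encodes multiple localization labels as a multi-hot vector. It is not the authors' implementation: the abstract does not specify the image encoding, so the one-hot layout, the reading of "intercepting" as keeping `k` nucleotides from each end, the function names (`intercept_ends`, `one_hot_image`, `multi_hot`), and the class names are all assumptions made for illustration.

```python
def intercept_ends(seq, k=35):
    """Keep the first k and last k nucleotides (assumed meaning of 'intercepting' the NC end)."""
    seq = seq.upper().replace("T", "U")  # normalize DNA-style input to RNA letters
    if len(seq) <= 2 * k:
        return seq  # too short to truncate without overlapping the two ends
    return seq[:k] + seq[-k:]


def one_hot_image(seq, alphabet="ACGU"):
    """Encode a sequence as a len(alphabet) x len(seq) binary matrix (assumed image form)."""
    # Bases outside the alphabet (e.g. N) become all-zero columns.
    return [[1 if base == letter else 0 for base in seq] for letter in alphabet]


def multi_hot(labels, classes):
    """Encode a set of localization labels as a 0/1 vector over the class list."""
    return [1 if c in labels else 0 for c in classes]


# Hypothetical example: a 120-nt sequence and three made-up class names.
sequence = "A" * 40 + "C" * 40 + "G" * 40
classes = ["cytosol", "nucleus", "ribosome"]

truncated = intercept_ends(sequence, k=35)
image = one_hot_image(truncated)
target = multi_hot({"nucleus", "cytosol"}, classes)
```

Here `truncated` has 70 nucleotides, `image` is a 4 x 70 matrix, and `target` marks two of the three classes as positive. A multi-label model would then be trained against one such vector per mRNA.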

Keywords: deep transfer learning; interpretable analysis; sequence analysis; subcellular localization.

Figures

Figure 1. The framework diagram of the prediction model MSlocPRED constructed in this paper.
Figure 2. Transfer learning network.
Figure 3. Five-fold cross-validation results of different resampling methods for Dataset 1.
Figure 4. Five-fold cross-validation results of different resampling methods for Dataset 2.
Figure 5. Prediction results of intercepted sequences with different values at the NC end on Dataset 1.
Figure 6. Prediction results of intercepted sequences with different values at the NC end on Dataset 2.
Figure 7. Waterfall chart for category 2 in Dataset 1.
Figure 8. Bar chart for category 2 in Dataset 1.
Figure 9. Bee swarm plot for category 2 in Dataset 1.
Figure 10. Waterfall chart for category 8 in Dataset 2.
Figure 11. Bar chart for category 8 in Dataset 2.
Figure 12. Bee swarm plot for category 8 in Dataset 2.
