Bioinformatics. 2025 Mar 29;41(4):btaf127.
doi: 10.1093/bioinformatics/btaf127.

RNALoc-LM: RNA subcellular localization prediction using pre-trained RNA language model

Min Zeng et al. Bioinformatics.

Abstract

Motivation: Accurately predicting RNA subcellular localization is crucial for understanding the cellular functions and regulatory mechanisms of RNAs. Although many computational methods have been developed to predict the subcellular localization of lncRNAs, miRNAs, and circRNAs, very few of them are designed to predict the subcellular localization of multiple types of RNAs simultaneously. In addition, pre-trained RNA language models have recently shown remarkable performance in various bioinformatics tasks, such as structure prediction and functional annotation. Despite these advances, a significant gap remains in applying pre-trained RNA language models specifically to RNA subcellular localization prediction.

Results: In this study, we proposed RNALoc-LM, the first interpretable deep-learning framework that leverages a pre-trained RNA language model for predicting RNA subcellular localization. RNALoc-LM uses a pre-trained RNA language model to encode RNA sequences, then captures local patterns and long-range dependencies through TextCNN and BiLSTM modules. A multi-head attention mechanism is used to focus on important regions within the RNA sequences. The results demonstrate that RNALoc-LM significantly outperforms both deep-learning baselines and existing state-of-the-art predictors. Additionally, motif analysis highlights RNALoc-LM's potential for discovering important motifs, while an ablation study confirms the effectiveness of the RNA sequence embeddings generated by the pre-trained RNA language model.

Availability and implementation: The RNALoc-LM web server is available at http://csuligroup.com:8000/RNALoc-LM. The source code can be obtained from https://github.com/CSUBioGroup/RNALoc-LM.


Figures

Figure 1.
Architecture of RNALoc-LM. RNALoc-LM uses the pre-trained RNA language model RNA-FM as its embedding module. By inputting the RNA sequence into this embedding module, an embedding matrix is generated. The TextCNN network is then used to extract local patterns from the embeddings. Following this, a BiLSTM module captures long-range dependencies and contextual information. Additionally, RNALoc-LM incorporates a multi-head attention mechanism to focus on the important segments of the RNA sequence. Finally, a fully connected layer is used to perform the RNA subcellular localization prediction task.
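The data flow described in this caption can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' implementation: the class name `LocalizationSketch`, the embedding width of 640, the kernel sizes, hidden sizes, number of heads, number of classes, and the mean pooling before the final layer are all assumptions chosen for the example, and a random tensor stands in for the embedding matrix that RNA-FM would produce.

```python
import torch
import torch.nn as nn


class LocalizationSketch(nn.Module):
    """Hypothetical stand-in for the Figure 1 pipeline; all sizes are assumptions."""

    def __init__(self, embed_dim=640, conv_channels=64, kernel_sizes=(3, 5, 7),
                 lstm_hidden=64, num_heads=4, num_classes=2):
        super().__init__()
        # TextCNN-style block: parallel 1-D convolutions with different widths
        # pick up local patterns; odd kernels with k // 2 padding keep the length.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, conv_channels, k, padding=k // 2) for k in kernel_sizes]
        )
        cnn_out = conv_channels * len(kernel_sizes)
        # BiLSTM reads the sequence in both directions for long-range context.
        self.bilstm = nn.LSTM(cnn_out, lstm_hidden, batch_first=True, bidirectional=True)
        # Multi-head self-attention weights positions along the sequence.
        self.attention = nn.MultiheadAttention(2 * lstm_hidden, num_heads, batch_first=True)
        # Fully connected layer producing one logit per localization class.
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, embed_dim)
        x = embeddings.transpose(1, 2)                      # (batch, embed_dim, seq_len)
        x = torch.cat([torch.relu(conv(x)) for conv in self.convs], dim=1)
        x = x.transpose(1, 2)                               # (batch, seq_len, cnn_out)
        x, _ = self.bilstm(x)                               # (batch, seq_len, 2 * lstm_hidden)
        attended, weights = self.attention(x, x, x)         # weights: (batch, seq_len, seq_len)
        pooled = attended.mean(dim=1)                       # assumed pooling over positions
        return self.classifier(pooled), weights


torch.manual_seed(0)
model = LocalizationSketch()
fake_embeddings = torch.randn(2, 50, 640)  # placeholder for language-model output
logits, weights = model(fake_embeddings)
print(logits.shape, weights.shape)
```

Returning the attention weights alongside the logits is one way to inspect which sequence positions the model attends to; how RNALoc-LM itself derives motifs from attention is not specified in this record.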
Figure 2.
Performance comparison of RNALoc-LM with existing predictors on the independent test set. (a) Comparison of RNALoc-LM with five predictors for predicting lncRNA subcellular localization. (b) Comparison of RNALoc-LM with two predictors for predicting miRNA subcellular localization. (c) Comparison of RNALoc-LM with two predictors for predicting circRNA subcellular localization.
Figure 3.
Motifs identified by the MEME suite (left) and by RNALoc-LM (middle). The right column displays the E-values for the motifs discovered by the MEME suite. (a) Identified lncRNA motifs. (b) Identified miRNA motifs. (c) Identified circRNA motifs.
Figure 4.
RNALoc-LM captures known motifs associated with subcellular localization in different RNA types. (a) RNALoc-LM captures the motifs “AGCCC” and “RCCUCCC,” both associated with nuclear localization in lncRNAs. (b) RNALoc-LM captures the “AGUGUU” motif in miRNAs, which is associated with nuclear localization. (c) RNALoc-LM captures the “GAUGAA” motif in circRNAs, which is associated with nuclear localization.
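To make the motif notation concrete: in the IUPAC nucleotide code, R denotes a purine (A or G), so “RCCUCCC” matches both “ACCUCCC” and “GCCUCCC.” The sketch below scans a sequence for such motifs with a regular expression. The helper `find_motif` and the toy sequence are hypothetical and serve only as an illustration; this is plain pattern matching, not the attention-based motif analysis used by RNALoc-LM.

```python
import re

# Subset of IUPAC nucleotide codes, expressed as regex fragments for RNA.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]",    # purine
    "Y": "[CU]",    # pyrimidine
    "N": "[ACGU]",  # any base
}


def find_motif(sequence, motif):
    """Return 0-based start positions of all (possibly overlapping) motif matches."""
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input as well
    pattern = "".join(IUPAC[base] for base in motif)
    # A lookahead is used so that overlapping occurrences are also reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", rna)]


toy = "GGAGCCCUUACCUCCCAAGCCUCCCG"  # made-up sequence, not real data
print(find_motif(toy, "AGCCC"))    # [2]
print(find_motif(toy, "RCCUCCC"))  # [9, 18]
```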
Figure 5.
Prediction results for three representative RNA samples using RNALoc-LM. (a) Prediction results for lncRNA LOC654780. (b) Prediction results for miRNA hsa-mir-653. (c) Prediction results for circRNA hsa_circ_0122817.
