Curr Genomics. 2021 Dec 30;22(5):384-390.
doi: 10.2174/1389202922666211011143008.

Splice Junction Identification using Long Short-Term Memory Neural Networks

Kevin Regan et al. Curr Genomics.

Abstract

Background: Splice junctions are key to the transition from pre-messenger RNA to mature messenger RNA, and alternative splicing makes them especially important in multi-exon genes. Because a very high percentage of multi-exon genes undergo alternative splicing, identifying splice junctions is an attractive research topic with important implications.

Objective: The aim of this paper is to develop a deep learning model capable of identifying splice junctions in RNA sequences using 13,666 unique sequences of primate RNA.

Methods: A Long Short-Term Memory (LSTM) Neural Network model is developed that classifies a given sequence as EI (Exon-Intron splice), IE (Intron-Exon splice), or N (No splice). The model is trained with groups of trinucleotides and its performance is tested using validation and test data to prevent bias.
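
The trinucleotide grouping mentioned in the Methods can be illustrated with a minimal sketch of how a sequence might be turned into integer tokens before being passed to an LSTM. The abstract does not say whether the groups overlap, how unknown symbols are handled, or how labels are encoded, so the `stride` parameter, the reserved index 0, the A/C/G/T alphabet, and the `LABELS` mapping below are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import product

# Assumption: sequences use the four-letter alphabet A, C, G, T.
NUCLEOTIDES = "ACGT"

# Assumption: index 0 is reserved for any trinucleotide containing an
# unrecognized symbol; the 64 valid trinucleotides get indices 1..64.
UNKNOWN_INDEX = 0
TRINUCLEOTIDE_INDEX = {
    "".join(triple): i + 1
    for i, triple in enumerate(product(NUCLEOTIDES, repeat=3))
}

# The three classes named in the abstract; the integer codes are hypothetical.
LABELS = {"EI": 0, "IE": 1, "N": 2}


def to_trinucleotides(sequence, stride=3):
    """Split a sequence into groups of three nucleotides.

    stride=3 gives non-overlapping groups, stride=1 gives overlapping ones.
    Trailing nucleotides that do not fill a whole group are dropped.
    """
    seq = sequence.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, stride)]


def encode(sequence, stride=3):
    """Map a sequence to a list of integer token ids, one per trinucleotide."""
    return [
        TRINUCLEOTIDE_INDEX.get(tri, UNKNOWN_INDEX)
        for tri in to_trinucleotides(sequence, stride)
    ]
```

For example, `to_trinucleotides("ACGTAC")` yields the two groups `ACG` and `TAC`, and `encode` turns each group into an id that an embedding layer in front of an LSTM could consume.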

Results: Model performance was measured using accuracy and F-score on the test data. The finalized model achieved an average accuracy of 91.34% and an average F-score of 91.36% over 50 runs.
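
As a rough sketch of how the two reported metrics could be computed for a three-class problem and then averaged over repeated runs: the abstract does not state which F-score averaging scheme was used, so the macro average below is an assumption, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of sequences whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f_score(y_true, y_pred, classes=("EI", "IE", "N")):
    """Unweighted mean of the per-class F1 scores (macro averaging is assumed)."""
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_over_runs(values):
    """Average a metric over independent training runs."""
    return sum(values) / len(values)
```

With true labels `["EI", "IE", "N", "N"]` and predictions `["EI", "N", "N", "N"]`, the accuracy is 0.75 and the per-class F1 scores are 1.0, 0.0, and 0.8, giving a macro F-score of 0.6.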

Conclusion: Comparisons show that the proposed model is highly competitive with recent Convolutional Neural Network structures. It achieves the highest accuracy and F-score among published alternative LSTM structures.

Keywords: LSTM; RNA-seq; Splice junction; classification; deep learning; neural networks.

Figures

Fig. (1). Pre-mRNA spliced into mature mRNA (figure obtained from https://www.khanacademy.org).
Fig. (2). Outline of the model development process.
Fig. (3). Outline of the developed deep learning model.
Fig. (4). Distribution of accuracy and F-score in the test set for 50 runs of the developed LSTM model (in blue) and a CNN model for comparison (in gray).
