Splice Junction Identification using Long Short-Term Memory Neural Networks
- PMID: 35283668
- PMCID: PMC8844938
- DOI: 10.2174/1389202922666211011143008
Abstract
Background: Splice junctions are key to the transition from pre-messenger RNA to mature messenger RNA in many multi-exon genes that undergo alternative splicing. Because a very high percentage of multi-exon genes undergo alternative splicing, identifying splice junctions is an attractive research topic with important implications.
Objective: The aim of this paper is to develop a deep learning model capable of identifying splice junctions in RNA sequences using 13,666 unique sequences of primate RNA.
Methods: A Long Short-Term Memory (LSTM) Neural Network model is developed that classifies a given sequence as EI (Exon-Intron splice), IE (Intron-Exon splice), or N (No splice). The model is trained on groups of trinucleotides, and its performance is evaluated on validation and test data to prevent bias.
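The trinucleotide grouping mentioned in the Methods can be illustrated with a minimal sketch. The abstract does not say whether the groups overlap, which alphabet is used, or how ambiguous bases are handled, so the `stride` parameter, the A/C/G/T alphabet (with U mapped to T), the reserved index 0, and all function names below are assumptions made for illustration, not the authors' preprocessing.

```python
from itertools import product

# Assumed alphabet; U is mapped to T so RNA-style input also works.
NUCLEOTIDES = "ACGT"

# Map each of the 64 trinucleotides to an integer 1..64.
# Index 0 is reserved (an assumption) for groups containing ambiguous bases.
TRINUC_INDEX = {
    "".join(p): i + 1 for i, p in enumerate(product(NUCLEOTIDES, repeat=3))
}

# The three classes named in the abstract.
LABELS = ("EI", "IE", "N")


def to_trinucleotides(seq, stride=3):
    """Split a sequence into groups of three bases.

    stride=3 gives non-overlapping groups; stride=1 gives overlapping
    3-mers. Trailing bases that do not fill a group are dropped.
    """
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, stride)]


def encode(seq, stride=3):
    """Turn a sequence into a list of integer tokens for a sequence model."""
    return [TRINUC_INDEX.get(t, 0) for t in to_trinucleotides(seq, stride)]
```

For example, `to_trinucleotides("ACGTAC")` yields the two groups `ACG` and `TAC`; the resulting integer tokens are the kind of input an embedding layer followed by an LSTM would typically consume.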
Results: Model performance was measured using accuracy and f-score on the test data. The finalized model achieved an average accuracy of 91.34% with an average f-score of 91.36% over 50 runs.
Conclusion: Comparisons show that the model is highly competitive with recent Convolutional Neural Network structures. The proposed LSTM model achieves the highest accuracy and f-score among published alternative LSTM structures.
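As a rough sketch of how such metrics can be computed for the three classes and averaged over repeated runs: the abstract does not state how the f-score is averaged across classes, so macro averaging is an assumption here, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of sequences whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=("EI", "IE", "N")):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members scores 0 by convention.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_over_runs(values):
    """Average a metric over repeated training runs (e.g. 50 runs)."""
    return sum(values) / len(values)
```

Each run would produce one accuracy and one f-score on the test data; `mean_over_runs` then reduces the per-run values to a single reported average.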
Keywords: LSTM; RNA-seq; Splice junction; classification; deep learning; neural networks.
© 2021 Bentham Science Publishers.
Similar articles
- Discerning novel splice junctions derived from RNA-seq alignment: a deep learning approach. BMC Genomics. 2018 Dec 27;19(1):971. doi: 10.1186/s12864-018-5350-1. PMID: 30591034 Free PMC article.
- Splice-site identification for exon prediction using bidirectional LSTM-RNN approach. Biochem Biophys Rep. 2022 May 26;30:101285. doi: 10.1016/j.bbrep.2022.101285. eCollection 2022 Jul. PMID: 35663929 Free PMC article.
- An automated framework for evaluation of deep learning models for splice site predictions. Sci Rep. 2023 Jun 23;13(1):10221. doi: 10.1038/s41598-023-34795-4. PMID: 37353532 Free PMC article.
- Multiplexed primer extension sequencing: A targeted RNA-seq method that enables high-precision quantitation of mRNA splicing isoforms and rare pre-mRNA splicing intermediates. Methods. 2020 Apr 1;176:34-45. doi: 10.1016/j.ymeth.2019.05.013. Epub 2019 May 21. PMID: 31121301 Free PMC article. Review.
- Unannotated splicing regulatory elements in deep intron space. Wiley Interdiscip Rev RNA. 2021 Sep;12(5):e1656. doi: 10.1002/wrna.1656. Epub 2021 Apr 22. PMID: 33887804 Review.
Cited by
- Advances in alternative splicing identification: deep learning and pantranscriptome. Front Plant Sci. 2023 Sep 18;14:1232466. doi: 10.3389/fpls.2023.1232466. eCollection 2023. PMID: 37790793 Free PMC article.