A scoping review on deep learning for next-generation RNA-Seq. data analysis
- PMID: 37084004
- DOI: 10.1007/s10142-023-01064-6
Abstract
In the last decade, transcriptome research using next-generation sequencing (NGS) technologies has gathered considerable momentum amongst functional genomics scientists, particularly in clinical and biomedical research groups. The progressive adoption of NGS technologies has produced an abundance of next-generation transcriptomic data in public databases, harbouring a wealth of new knowledge. Nevertheless, knowledge discovery from these next-generation RNA-Seq. data requires extensive bioinformatics know-how as well as elaborate data analysis software packages suited to the type and context of the analysis. Several reliability and reproducibility concerns continue to impede RNA-Seq. data analysis. Characteristic challenges comprise data quality, hardware and networking provisions, the selection and prioritisation of data analysis tools and, significantly, the implementation of robust machine learning algorithms to make the most of these experimental transcriptomic data. Over the years, numerous machine learning algorithms, predominantly shallow learning approaches, have been implemented to improve transcriptomic data analysis. More recently, deep learning algorithms have become more mainstream, and their application to next-generation RNA-Seq. data analysis could be revolutionary for the biomedical domain in the coming years. In this scoping review, we attempt to determine the size and nature of the existing literature on deep learning for NGS RNA-Seq. data analysis. Contemporary topics in next-generation RNA-Seq. data analysis based on deep learning algorithms are critically reviewed, with an emphasis on open-source resources.
Keywords: Data analysis; Deep learning; Functional genomics; Machine learning; NGS; Omics.
© 2023. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.