A scoping review on deep learning for next-generation RNA-Seq. data analysis
- PMID: 37084004
- DOI: 10.1007/s10142-023-01064-6
Abstract
Over the last decade, transcriptome research based on next-generation sequencing (NGS) technologies has gathered considerable momentum among functional genomics scientists, particularly in clinical and biomedical research groups. The progressive adoption of NGS technologies has produced an abundance of next-generation transcriptomic data in public databases, harbouring a wealth of new knowledge. Nevertheless, knowledge discovery from these next-generation RNA-Seq. data requires extensive bioinformatics know-how as well as elaborate data analysis software packages suited to the type and context of the analysis. Several reliability and reproducibility concerns continue to impede RNA-Seq. data analysis. Characteristic challenges include data quality, hardware and networking provisions, the selection and prioritisation of data analysis tools and, significantly, the implementation of robust machine learning algorithms for maximal exploitation of these experimental transcriptomic data. Over the years, numerous machine learning algorithms have been implemented for improved transcriptomic data analysis, predominantly using shallow learning approaches. More recently, deep learning algorithms have become more mainstream, and their application to next-generation RNA-Seq. data analysis could be revolutionary for the biomedical domain in the coming years. In this scoping review, we attempt to determine the size and potential nature of the existing literature on deep learning for NGS RNA-Seq. data analysis. We critically review contemporary topics in next-generation RNA-Seq. data analysis based on deep learning algorithms, emphasising open-source resources.
Keywords: Data analysis; Deep learning; Functional genomics; Machine learning; NGS; Omics.
© 2023. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
