Machine learning and related approaches in transcriptomics
- PMID: 38852503
- DOI: 10.1016/j.bbrc.2024.150225
Abstract
Data acquisition used to be the bottleneck in the transcriptomic analytical pipeline. However, recent developments in transcriptome profiling technologies have increased researchers' ability to obtain data, shifting the focus to data analysis. Incorporating machine learning (ML) into traditional analytical methods makes it possible to handle larger volumes of complex data more efficiently. Many bioinformaticians, especially those unfamiliar with ML in the study of human transcriptomics and complex biological systems, face a significant barrier: limited awareness of how ML is currently used in this field. To address this gap, this review introduces these readers to the general types of ML, followed by a comprehensive range of more specific techniques, demonstrated through examples of their incorporation into analytical pipelines for human transcriptome investigations. Important computational aspects, such as data pre-processing, task formulation, results (performance of ML models), and validation methods, are covered. For practical relevance, the review focuses strongly on studies published within the last five years, almost exclusively examining human transcriptomes, with outcomes compared with those of standard non-ML tools.
Keywords: Deep learning; Disease prediction; Epigenetic modifications; Long-read sequencing; Machine learning; Microarray; RNA sequencing; Single-cell transcriptomics; Transcriptomics.
Copyright © 2024 The Authors. Published by Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.