RNA-seq data science: From raw data to effective interpretation
- PMID: 36999049
- PMCID: PMC10043755
- DOI: 10.3389/fgene.2023.997383
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Keywords: RNA sequencing; bioinformatics; differential gene expression; high throughput sequencing; read alignment; transcriptome quantification.
Copyright © 2023 Deshpande, Chhugani, Chang, Karlsberg, Loeffler, Zhang, Muszyńska, Munteanu, Yang, Rotman, Tao, Balliu, Tseng, Eskin, Zhao, Mohammadi, P. Łabaj and Mangul.
Conflict of interest statement
ET was employed by the company Pacific Biosciences (United States). The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.