Methods Mol Biol. 2024;2822:263-290.
doi: 10.1007/978-1-0716-3918-4_18.

RNA-Seq Data Analysis

James Li et al. Methods Mol Biol. 2024.

Abstract

RNA-Seq data analysis is a vital part of genomics research, turning vast and complex datasets into meaningful biological insights. The field evolves rapidly, so anyone seeking to unlock the potential of RNA-Seq data needs a thorough understanding of it. In this chapter, we describe a comprehensive RNA-Seq analysis pipeline that covers the entire process. The chapter begins with quality control, which ensures the integrity of the RNA-Seq data and lays the groundwork for all subsequent analyses. Preprocessing is addressed next: the raw sequence data undergo the modifications needed to prepare them for the alignment phase. This phase maps the processed sequences to a reference genome, a pivotal step for determining the origins and functions of these sequences. The chapter then turns to the heart of RNA-Seq analysis, differential expression analysis: the process of identifying genes whose expression levels vary across conditions or sample groups. Because understanding the biological context of these differentially expressed genes is essential, the chapter next covers functional analysis, where methods and tools such as Gene Ontology and pathway analyses help contextualize the roles and interactions of the identified genes within broader biological frameworks. The chapter does not stop at conventional analysis methods. It also covers machine learning applications for RNA-Seq data, introducing advanced techniques in dimension reduction and in both unsupervised and supervised learning. These approaches can reveal patterns and relationships in the data that might be imperceptible through traditional methods.
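
As a rough illustration of the differential expression step described in the abstract, the sketch below compares two sample groups gene by gene. It is a minimal sketch under stated assumptions, not the chapter's method: the count table, the gene names, the counts-per-million scaling, the pseudocount of 1, and the Welch t-statistic are all choices made here for illustration. It fits no count-based model and applies no multiple-testing correction.

```python
import math
from statistics import mean, variance

# Hypothetical raw read counts (illustrative only): for each gene, three
# control samples followed by three treated samples.
COUNTS = {
    "geneA": [100, 120, 110, 400, 420, 390],
    "geneB": [500, 480, 510, 490, 505, 495],
    "geneC": [300, 310, 290, 80, 70, 75],
}
N_CONTROL = 3  # assumption: the first three columns are the control group


def counts_per_million(counts):
    """Scale each sample's counts by its library size (total reads) to CPM."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(vals[s] for vals in counts.values()) for s in range(n_samples)]
    return {
        gene: [v / lib_sizes[s] * 1e6 for s, v in enumerate(vals)]
        for gene, vals in counts.items()
    }


def log2_fold_change(values, n_control, pseudocount=1.0):
    """log2 ratio of mean treated CPM to mean control CPM."""
    ctrl = mean(values[:n_control])
    trt = mean(values[n_control:])
    return math.log2((trt + pseudocount) / (ctrl + pseudocount))


def welch_t(values, n_control):
    """Welch t-statistic on log2(CPM + 1), treated minus control."""
    ctrl = [math.log2(v + 1.0) for v in values[:n_control]]
    trt = [math.log2(v + 1.0) for v in values[n_control:]]
    se = math.sqrt(variance(ctrl) / len(ctrl) + variance(trt) / len(trt))
    return (mean(trt) - mean(ctrl)) / se


cpm = counts_per_million(COUNTS)
lfc = {gene: log2_fold_change(vals, N_CONTROL) for gene, vals in cpm.items()}
t_stat = {gene: welch_t(vals, N_CONTROL) for gene, vals in cpm.items()}

for gene in COUNTS:
    print(f"{gene}: log2FC={lfc[gene]:+.2f}, t={t_stat[gene]:+.2f}")
```

With only three genes the library sizes are dominated by the genes being tested, so the numbers are purely illustrative; the point is the order of operations (normalize, summarize per group, compare).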

Keywords: Differential expression; Next-generation sequencing; Sequence alignment; Transcriptomics.

Figures

Fig. 1:
RNA-Seq analysis pipeline.
Fig. 2:
A clustering result shown in the two-dimensional PCA space; each point represents a sample. PC1 and PC2 are the first two principal components computed from the RNA-Seq expression levels of multiple genes. Circles and triangles mark the two k-means clusters.
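
The kind of analysis summarized in the Fig. 2 caption can be sketched as follows: project samples onto their first two principal components, then split them into two k-means clusters. This is a minimal standard-library sketch, not the authors' code. The expression matrix is hypothetical, the power-iteration PCA and the deterministic centroid initialization are choices made here for brevity, and a real analysis would normally use an established numerical library.

```python
import math

# Hypothetical log-expression matrix: rows are samples, columns are genes.
# Samples 0-2 and 3-5 are constructed to form two distinct groups.
EXPR = [
    [8.0, 7.5, 2.0, 2.5],
    [8.2, 7.9, 2.1, 2.2],
    [7.8, 7.6, 1.9, 2.6],
    [2.1, 2.4, 8.1, 7.7],
    [1.9, 2.6, 7.9, 8.0],
    [2.2, 2.3, 8.3, 7.8],
]


def center(rows):
    """Subtract each gene's mean so PCA operates on centered data."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix between genes (columns)."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigvec(mat, iters=500):
    """Power iteration: unit eigenvector and eigenvalue of the largest eigenvalue."""
    p = len(mat)
    v = [float(i + 1) for i in range(p)]  # arbitrary asymmetric starting vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, lam


def first_two_pcs(rows):
    """Return (PC1, PC2) scores for every sample."""
    x = center(rows)
    cov = covariance(x)
    p = len(cov)
    v1, lam1 = leading_eigvec(cov)
    # Deflation: remove the PC1 direction, then find the next component.
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(p)] for i in range(p)]
    v2, _ = leading_eigvec(deflated)
    return [(sum(a * b for a, b in zip(r, v1)), sum(a * b for a, b in zip(r, v2)))
            for r in x]


def kmeans_two_clusters(points, iters=100):
    """Lloyd's algorithm for k=2; starts from the first point and the point farthest from it."""
    centroids = [points[0], max(points, key=lambda q: math.dist(q, points[0]))]
    labels = None
    for _ in range(iters):
        new_labels = [min((0, 1), key=lambda k: math.dist(q, centroids[k])) for q in points]
        if new_labels == labels:
            break
        labels = new_labels
        for k in (0, 1):
            members = [q for q, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


scores = first_two_pcs(EXPR)
labels = kmeans_two_clusters(scores)
for i, ((pc1, pc2), lab) in enumerate(zip(scores, labels)):
    print(f"sample {i}: PC1={pc1:+.2f} PC2={pc2:+.2f} cluster={lab}")
```

Plotting the `scores` with one marker shape per value in `labels` would give a picture of the same form as the figure.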
Fig. 3:
A heatmap with bi-clustering dendrograms. Each row represents a gene and each column represents a sample. Each cell shows the expression level of one gene in one sample.
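
One common way to produce a row-and-column arrangement like the one in the Fig. 3 caption is to run hierarchical clustering separately on genes and on samples, then reorder the matrix by the resulting dendrogram leaf orders. The sketch below assumes average linkage with Euclidean distance and uses a hypothetical matrix; the source does not state which linkage or distance was used for the figure.

```python
import math

# Hypothetical expression matrix: rows are genes, columns are samples.
# Columns are deliberately interleaved (group A, group B, A, B, A, B).
MATRIX = [
    [8.0, 2.1, 8.2, 1.9, 7.8, 2.2],  # gene 0
    [7.5, 2.4, 7.9, 2.6, 7.6, 2.3],  # gene 1
    [2.0, 8.1, 2.1, 7.9, 1.9, 8.3],  # gene 2
    [2.5, 7.7, 2.2, 8.0, 2.6, 7.8],  # gene 3
]


def leaf_order(vectors):
    """Agglomerative clustering (average linkage); returns indices in dendrogram leaf order."""
    clusters = [[i] for i in range(len(vectors))]

    def avg_dist(a, b):
        total = sum(math.dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (avg_dist(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters[i] + clusters[j]  # keeps each subtree's leaves contiguous
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


row_order = leaf_order(MATRIX)             # cluster genes
col_order = leaf_order(list(zip(*MATRIX)))  # cluster samples (transposed matrix)

# Reordered matrix, ready to be drawn as a heatmap.
reordered = [[MATRIX[r][c] for c in col_order] for r in row_order]

print("gene order:  ", row_order)
print("sample order:", col_order)
for row in reordered:
    print(" ".join(f"{v:4.1f}" for v in row))
```

After reordering, similar genes sit in adjacent rows and similar samples in adjacent columns, which is what makes blocks of high and low expression visible in such a heatmap.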
