Curr Protoc Mol Biol. 2018 Oct;124(1):e68.
doi: 10.1002/cpmb.68. Epub 2018 Sep 17.

RNA-seq: Basic Bioinformatics Analysis

Fei Ji et al. Curr Protoc Mol Biol. 2018 Oct.

Abstract

Quantitative analysis of gene expression is crucial for understanding the molecular mechanisms underlying genome regulation. RNA-seq is a powerful platform for comprehensive investigation of the transcriptome. In this unit, we present a general bioinformatics workflow for the quantitative analysis of RNA-seq data and describe a few current publicly available computational tools applicable at various steps of this workflow. These tools comprise a pipeline for quality assessment and quantitation of RNA-seq data that starts from raw sequencing files and is focused on the identification and analysis of genes that are differentially expressed between biological conditions. © 2018 by John Wiley & Sons, Inc.

Keywords: RNA-seq; bioinformatics; differentially expressed genes; quantitative analysis of gene expression.

Conflict of interest statement

The authors have declared no conflicts of interest for this article.

Figures

Figure 1.
Example of a multidimensional scaling (MDS) plot of six RNA-seq samples. Each sample is shown as a two-dimensional point labeled with its sample name and colored by group (condition). In this particular case, the group of three biological replicates from mutant samples (mut, shown in red) is well separated from the group of three biological replicates from wild-type samples (WT, shown in black). In the WT group, replicate 3 (WT.rep3) is separated from the other replicates of the same group and may be an outlier.
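As a rough illustration of what an MDS plot summarizes, the sketch below computes pairwise distances between samples and reports which WT replicate lies farthest from the others in its group. The expression values are invented, and plain Euclidean distance is an assumption made for simplicity; it is not necessarily the distance measure used to produce Figure 1.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical log2 expression values for four genes in six samples.
samples = {
    "WT.rep1": [5.0, 3.0, 8.0, 1.0],
    "WT.rep2": [5.1, 3.1, 7.9, 1.1],
    "WT.rep3": [6.0, 2.0, 7.0, 2.0],
    "mut.rep1": [9.0, 0.5, 4.0, 6.0],
    "mut.rep2": [9.1, 0.6, 4.1, 5.9],
    "mut.rep3": [8.9, 0.4, 3.9, 6.1],
}

# Pairwise distance table: the kind of input an MDS projection starts from.
distances = {
    (a, b): euclidean(samples[a], samples[b])
    for a, b in combinations(samples, 2)
}

def mean_distance_within_group(name, group):
    """Average distance from one sample to the other members of its group."""
    others = [s for s in group if s != name]
    return sum(euclidean(samples[name], samples[o]) for o in others) / len(others)

wt_group = [s for s in samples if s.startswith("WT")]
spread = {s: mean_distance_within_group(s, wt_group) for s in wt_group}
most_distant = max(spread, key=spread.get)
print(most_distant, round(spread[most_distant], 3))
```

A replicate with a much larger within-group distance than its siblings is a candidate outlier, but whether to exclude it remains a judgment call that should also take the QC metrics into account.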
Figure 2.
Example of a volcano plot. Each gene is represented as a point whose x-coordinate is the difference in expression between the two compared groups of replicates (log2 of fold change, logFC) and whose y-coordinate is the statistical significance of this difference (-log10 of FDR or P-value). The plot has a characteristic shape reflecting a general relationship between fold change and statistical significance. The overall distribution of points is usually symmetrical between up-regulated genes (points to the right of x=0) and down-regulated genes (points to the left of x=0), with the majority of points located near the origin, which corresponds to small and statistically insignificant differences. Differentially expressed genes (highlighted in red) are defined here by cutoffs of more than 2-fold change (logFC > 1 or logFC < −1) and statistical significance (FDR or P-value) < 0.01.
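The cutoffs in this caption can be written as a small classification rule. The sketch below is a minimal illustration, not the authors' code: the gene names and values are invented, the function names are hypothetical, and it assumes the significance measure is strictly greater than zero so that the logarithm is defined.

```python
import math

def volcano_point(log_fc, fdr):
    """Volcano plot coordinates: x = logFC, y = -log10(FDR). Requires fdr > 0."""
    return log_fc, -math.log10(fdr)

def classify(log_fc, fdr, lfc_cutoff=1.0, fdr_cutoff=0.01):
    """Label a gene using the caption's cutoffs (|logFC| > 1 and FDR < 0.01)."""
    if fdr < fdr_cutoff and log_fc > lfc_cutoff:
        return "up"
    if fdr < fdr_cutoff and log_fc < -lfc_cutoff:
        return "down"
    return "not significant"

# Hypothetical (logFC, FDR) pairs for four genes.
genes = {
    "geneA": (2.5, 1e-5),   # large change, significant
    "geneB": (-1.8, 0.002), # large negative change, significant
    "geneC": (0.3, 0.6),    # small change, not significant
    "geneD": (3.0, 0.2),    # large change, but not significant
}

labels = {name: classify(lfc, fdr) for name, (lfc, fdr) in genes.items()}
points = {name: volcano_point(lfc, fdr) for name, (lfc, fdr) in genes.items()}
for name in genes:
    print(name, labels[name], points[name])
```

Note that geneD fails the rule despite its large fold change: both cutoffs must hold, which is why the highlighted points occupy only the upper left and upper right regions of the plot.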
Figure 3.
Heatmap of expression values of differentially expressed genes across individual samples. Clustering of expression patterns of samples (columns) and genes (rows) is represented by the dendrograms on top and on the left, respectively. Color indicates expression value (log10 of RPKM).
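For reference, the color scale is based on RPKM (reads per kilobase of transcript per million mapped reads). The sketch below uses the standard RPKM definition with invented numbers; the pseudocount added before taking the logarithm is an assumption introduced here to avoid log10(0) and is not stated in the caption.

```python
import math

def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

def log10_rpkm(read_count, gene_length_bp, total_mapped_reads, pseudocount=1.0):
    """log10 of RPKM; the pseudocount (an assumption) keeps zero counts finite."""
    return math.log10(rpkm(read_count, gene_length_bp, total_mapped_reads) + pseudocount)

# Hypothetical gene: 500 reads, 2,000 bp long, in a library of 10 million mapped reads.
value = rpkm(500, 2000, 10_000_000)
print(value, log10_rpkm(500, 2000, 10_000_000))
```

Normalizing by both gene length and library size makes values comparable across genes and samples, which is what allows a single color scale to be used for the whole heatmap.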
