Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2014 Jun 15;30(12):i113-20.
doi: 10.1093/bioinformatics/btu274.

Methods for time series analysis of RNA-seq data with application to human Th17 cell differentiation

Affiliations

Methods for time series analysis of RNA-seq data with application to human Th17 cell differentiation

Tarmo Äijö et al. Bioinformatics. .

Abstract

Motivation: Gene expression profiling using RNA-seq is a powerful technique for screening RNA species' landscapes and their dynamics in an unbiased way. While several advanced methods exist for differential expression analysis of RNA-seq data, proper tools to anal.yze RNA-seq time-course have not been proposed.

Results: In this study, we use RNA-seq to measure gene expression during the early human T helper 17 (Th17) cell differentiation and T-: cell activation (Th0). To quantify Th17-: specific gene expression dynamics, we present a novel statistical methodology, DyNB, for analyzing time-course RNA-seq data. We use non-parametric Gaussian processes to model temporal correlation in gene expression and combine that with negative binomial likelihood for the count data. To account for experiment-: specific biases in gene expression dynamics, such as differences in cell differentiation efficiencies, we propose a method to rescale the dynamics between replicated measurements. We develop an MCMC sampling method to make inference of differential expression dynamics between conditions. DyNB identifies several known and novel genes involved in Th17 differentiation. Analysis of differentiation efficiencies revealed consistent patterns in gene expression dynamics between different cultures. We use qRT-PCR to validate differential expression and differentiation efficiencies for selected genes. Comparison of the results with those obtained via traditional timepoint-: wise analysis shows that time-course analysis together with time rescaling between cultures identifies differentially expressed genes which would not otherwise be detected.

Availability: An implementation of the proposed computational methods will be available at http://research.ics.aalto.fi/csb/software/

PubMed Disclaimer

Figures

Fig. 1.
Fig. 1.
Transcriptome dynamics of Th17 marker genes. (A) T helper precursor cells isolated from cord blood are activated using plate-bound αCD3 and soluble αCD28 in the presence of αIFN-γ and αIL-4 yielding the cells to follow the Th0 lineage. Th17 commitment is achieved by activation and polarization condition, including IL-6, IL-1β and TGF-β. Cells were harvested at 0, 12, 24, 48, and 72 h. From the harvested cells the RNA was extracted and used for preparation an RNA-seq library. (B) The estimated smooth representation of IL17A dynamics without time scaling. The read counts are on the y-axis. Circles and diamonds mark the measurements from Th0 and Th17 cells, respectively, and the replicates are distinguish with different colors. The solid curves are the posterior means of the specific Th0 and Th17 models (M1) with corresponding 95% CIs (shaded areas around means). (C and D) Same as (B), but the depicted results are for the IL17F and RORC genes
Fig. 2.
Fig. 2.
Modeling differentiation dynamics. (A) An illustration showing the effects of the time scaling. The axis in the center panel shows the unscaled time axis. The axes in the top and bottom panels show the maximum allowed deceleration (−32 h at 72 h) and acceleration (32 h at 72 h) relative to the unscaled case, respectively. (B) The estimated smooth representation of the simulated data with the time scaling. The first replicate is a delayed version of the second and third replicates. The red arrows illustrate how much the measurements are effectively moved due to the time scaling. (C) The posterior distribution of the time differences at timepoint 72 h
Fig. 3.
Fig. 3.
Perturbated differentiation dynamics. (A) The estimated smooth representation of IL17A dynamics with the time scaling. The red arrows illustrate how much the measurements are effectively moved due to the time scaling. (B and C) Same as (A), but the depicted results are for the IL17F and RORC genes
Fig. 4.
Fig. 4.
Validation of marker gene expression. (A) qRT-PCR time-series measurements of IL17A mRNA levels in the same samples where RNA-seq was performed. The error bars are depicting the SDs. The colors distinguish the different samples. (B and C) Same as (A) but for IL17F and RORC, respectively. (D) The scatter plots illustrating the replicate-specific correspondence between the qRT-PCR and RNA-seq gene expression estimates of the IL17A (top panel), IL17F (middle panel) and RORC genes (bottom panel) over time in Th0 cells. The correlation is quantified using the Pearson correlation coefficient (r). (E) Same as in (D) but for Th17 cells
Fig. 5.
Fig. 5.
The replicate-specific differentiation efficiencies. (A) Density plots representing the distribution of estimated time differences in gene level in the Th0 and Th17 lineages. A gene is on diagonal if the estimated time differences in the Th0 and Th17 cells are the same. The results for the first and third replicate are depicted in top panel and bottom panel, respectively. (B) Presence of time scaling in Th0 lineage among the 698 differentially expressed genes. The dashed line represents the prior distribution of the amount of time scaling at 72 h. The red area shows the posterior distribution of the time scaling for the first replicate and the purple shows the posterior distribution for the third replicate. (C) Same as (B) but here the focus is on Th17 lineage. The focus is on the differentially expressed genes in (B and C)
Fig. 6.
Fig. 6.
A comparison of the results with DESeq. (A) The overlap between the sets of differentially expressed genes identified by DyNB and DESeq (top panel). In the bottom panel we take into account only the top 698 hits from DESeq analysis to make the gene sets equal in size. (B) The number of the top 698 DESeq hits that are found to be differentially expressed exactly at one, two, three or four timepoints in the DESeq analysis. (C) A quantification of how the genes belonging to the classes presented in (B) are found by the presented method using the precision metric. The DESeq hits are taken into account in the order of descending significance (x-axis), which are used to evaluate precisions. For example, precision is one when all the considered genes are found in the set given by DyNB
Fig. 7.
Fig. 7.
Examples of differentially expressed genes detected exclusively by DyNB. (A) The estimated smooth representation of ISG20 dynamics with the time scaling. (B and C) Same as (A), but the depicted results are for the RAB13 and TIAM1 genes

References

    1. Adams RP, et al. Proceedings of the 26th Annual International Conference on Machine Learning. New York, NY: ICML ’09. ACM; 2009. Tractable nonparametric Bayesian inference in Poisson processes with Gaussian process intensities; pp. 9–16.
    1. Äijö T, et al. An integrative computational systems biology approach identifies differentially regulated dynamic transcriptome signatures which drive the initiation of human T helper cell differentiation. BMC Genomics. 2012;13:572. - PMC - PubMed
    1. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11:R106. - PMC - PubMed
    1. Brucklacher-Waldert V, et al. Phenotypical characterization of human Th17 cells unambiguously identified by surface IL-17A expression. J. Immunol. 2009;183:5494–5501. - PubMed
    1. Conesa A, Nueda MJ. Next-masigpro: dealing with RNA-seq time series. EMBnet J. 2013;19:42–43.

Publication types

LinkOut - more resources