AMIA Annu Symp Proc. 2018 Dec 5;2018:1338-1347. eCollection 2018.

Integration of Transcriptomic Data Identifies Global and Cell-Specific Asthma-Related Gene Expression Signatures


Mengyuan Kan et al. AMIA Annu Symp Proc.

Abstract

Over 140,000 transcriptomic studies performed in healthy and diseased cell and tissue types, at baseline and after exposure to various agents, are available in public repositories. Integrating results of transcriptomic datasets has been an attractive approach to identify gene expression signatures that are more robust than those obtained for individual datasets, especially datasets with small sample size. We developed Reproducible Analysis and Validation of Expression Data (RAVED), a pipeline that facilitates the creation of R Markdown reports detailing reproducible analysis of publicly available transcriptomic data, and used it to analyze asthma and glucocorticoid response microarray and RNA-Seq datasets. Subsequently, we used three approaches to integrate summary statistics of these studies and identify cell/tissue-specific and global asthma and glucocorticoid-induced gene expression changes. Transcriptomic integration methods were incorporated into an online app called REALGAR, where end-users can specify datasets to integrate and quickly obtain results that may facilitate design of experimental studies.
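The abstract does not name the three integration approaches, so the following is only a minimal sketch of two common ways to combine per-study summary statistics for a single gene: inverse-variance (fixed-effect) pooling of log2 fold changes, and Fisher's method for combining p-values. It is not the RAVED or REALGAR implementation; the function names and the input values are hypothetical and chosen purely for illustration.

```python
import math


def fixed_effect_pool(log2_fcs, std_errs):
    """Inverse-variance (fixed-effect) pooling of per-study log2 fold changes.

    Assumption: each study reports a log2 fold change and its standard error.
    Returns the pooled estimate, its standard error, and a 95% confidence interval.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log2_fcs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci


def fisher_combined_p(p_values):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k df under the null.

    For even degrees of freedom the survival function has a closed form:
    P(chi2_{2k} > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, series = 1.0, 0.0
    for i in range(k):
        series += term
        term *= half_stat / (i + 1)
    return math.exp(-half_stat) * series


if __name__ == "__main__":
    # Hypothetical per-study results for one gene (not taken from the paper).
    log2_fcs = [0.40, 0.55, 0.30]
    std_errs = [0.20, 0.25, 0.15]
    p_values = [0.04, 0.03, 0.05]

    pooled, pooled_se, (lo, hi) = fixed_effect_pool(log2_fcs, std_errs)
    print("pooled fold change:", 2 ** pooled)
    print("95% CI (fold change scale):", 2 ** lo, 2 ** hi)
    print("Fisher combined p-value:", fisher_combined_p(p_values))
```

In a genome-wide setting, the combined p-values would typically be adjusted for multiple testing across genes before applying a threshold such as q-value = 0.05.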

Figures

Figure 1.
Steps followed by RAVED to analyze microarray and RNA-Seq data are provided on the left. Specific programs and R packages (in bold) used for each step are shown on the right.
Figure 2.
Overview of the integration analysis. Twenty-five transcriptomic asthma and glucocorticoid datasets, covering various asthma endotypes and cell/tissue types, were integrated using three summary statistic-based methods.
Figure 3.
Sample of RAVED quality control outputs. Boxplots of relative log expression (RLE) and normalized unscaled standard error (NUSE) (left), MA plots (top right) and spatial plots (bottom right) are generated during the microarray data quality control procedure.
Figure 4.
Comparison of genes with significant expression changes across cell/tissue types for asthma studies (left) and glucocorticoid response studies (center), and in asthma vs. glucocorticoid response datasets across all cell/tissue types (right). The total number of genes available for each comparison is shown on the bottom right of the corresponding diagram.
Figure 5.
Integration results for NSFL1C. A) Q-values corresponding to integration of all cell/tissue types, blood cells, and structural cells for asthma and glucocorticoid (GC) datasets obtained via three methods. Dashed line indicates q-value=0.05. B) Overall fold changes from effect size-based integration (circles and triangles) and fold changes from individual studies (dots). Bars represent 95% confidence intervals. Dashed line indicates fold change=1.
Figure 6.
Representative distribution of effect sizes from an individual microarray study GSE65401 (left) and an RNA-Seq study SRP033351 (right).
