Cell Rep. 2019 Feb 5;26(6):1627-1640.e7.
doi: 10.1016/j.celrep.2019.01.041.

RNA-Seq Signatures Normalized by mRNA Abundance Allow Absolute Deconvolution of Human Immune Cell Types

Gianni Monaco et al. Cell Rep. 2019.

Abstract

The molecular characterization of immune subsets is important for designing effective strategies to understand and treat diseases. We characterized 29 immune cell types within the peripheral blood mononuclear cell (PBMC) fraction of healthy donors using RNA-seq (RNA sequencing) and flow cytometry. Our dataset was used, first, to identify sets of genes that are specific, are co-expressed, and have housekeeping roles across the 29 cell types. Then, we examined differences in mRNA heterogeneity and mRNA abundance revealing cell type specificity. Last, we performed absolute deconvolution on a suitable set of immune cell types using transcriptomics signatures normalized by mRNA abundance. Absolute deconvolution is ready to use for PBMC transcriptomic data using our Shiny app (https://github.com/giannimonaco/ABIS). We benchmarked different deconvolution and normalization methods and validated the resources in independent cohorts. Our work has research, clinical, and diagnostic value by making it possible to effectively associate observations in bulk transcriptomics data to specific immune subsets.
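The idea of normalizing signatures by mRNA abundance can be illustrated with a toy calculation. The sketch below is not the ABIS implementation: the function names, the two-cell-type signature, the scaling factors, and the use of ordinary least squares followed by renormalization are all assumptions made for illustration. It shows that deconvolving with plain TPM signatures recovers mRNA fractions, whereas multiplying each signature column by a per-cell-type mRNA scaling factor recovers cell proportions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(sig, bulk):
    """Ordinary least squares via the normal equations (sig: genes x cell types)."""
    k = len(sig[0])
    ata = [[sum(row[i] * row[j] for row in sig) for j in range(k)] for i in range(k)]
    atb = [sum(row[i] * g for row, g in zip(sig, bulk)) for i in range(k)]
    return solve_linear(ata, atb)


def deconvolve(tpm_sig, scaling, bulk):
    """Hypothetical helper: scale each signature column, fit, renormalize to sum to 1."""
    abs_sig = [[v * f for v, f in zip(row, scaling)] for row in tpm_sig]
    weights = least_squares(abs_sig, bulk)
    total = sum(weights)
    return [w / total for w in weights]


# Toy TPM signatures (rows: 4 genes, columns: 2 cell types); each column sums to 10^6.
tpm_sig = [
    [600000.0, 100000.0],
    [300000.0, 100000.0],
    [100000.0, 300000.0],
    [0.0, 500000.0],
]
# Assumed scaling factors: cell type 2 carries twice as much mRNA per cell.
scaling = [1.0, 2.0]
# Bulk TPM of a mixture with true cell proportions 0.75 / 0.25.
bulk = [400000.0, 220000.0, 180000.0, 200000.0]

mrna_fractions = deconvolve(tpm_sig, [1.0, 1.0], bulk)  # ignores mRNA abundance
cell_proportions = deconvolve(tpm_sig, scaling, bulk)   # abundance-normalized
print(mrna_fractions, cell_proportions)
```

In this toy mixture the TPM-only fit returns roughly 0.6 / 0.4, which overstates the cell type with more mRNA per cell, while the abundance-scaled fit returns the true 0.75 / 0.25.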

Keywords: RNA-seq; deconvolution; flow cytometry; gene modules; housekeeping; immune system; mRNA abundance; mRNA composition; mRNA heterogeneity; transcriptome.

Figures

Figure 1
Representation of the Sample Preparation and Data Collection. PBMC aliquots from two cohorts were used for (1) RNA-seq of 29 immune cell types (S4 cohort) and (2) microarray and RNA-seq of PBMCs and immunophenotyping of the 29 immune cell types (S13 cohort). Four staining panels (panels 1–4) were used to sort and immunophenotype the 29 immune cell types (Table S1). Tfh, T follicular helper; Tregs, T regulatory; Th, T helper; CM, central memory; EM, effector memory; TE, terminal effector; MAIT, mucosal-associated invariant T; SM, switched memory; NSM, non-switched memory; Ex, exhausted; LD, low-density; C, classical; I, intermediate; NC, non-classical; mDCs, myeloid dendritic cells; pDCs, plasmacytoid dendritic cells. See also Table S1 for full names and marker information.
Figure 2
Relationship between Immune Cell Types, Determined Using log2 TPM Values. (A) t-SNE analysis on the RNA-seq data of the 29 immune cell types and PBMCs. Results are shown in four separate plots to better distinguish the different cell types. Each plot highlights the PBMCs and the cell types of one of the four panels used for FACS. (B) Transcriptomic hematopoietic tree of the 29 immune cell types with progenitor cells fixed as the root of the tree.
Figure 3
Heatmap of DEGs between Each Immune Cell Type and Remaining Samples. Modules of genes were found by hierarchical clustering on Euclidean distance. The most biologically relevant GO terms associated with each module are reported on the left. The top differentially expressed genes (DEGs) are reported on the right. See the full list in Table S3.
Figure 4
Comparison of the Gene Expression Profile of the Immune Cell Types from Our Dataset (Columns) with Four External Datasets (Rows). From the samples of each FACS panel in our dataset, we selected the top 1,000 variable genes and calculated the Spearman correlation with samples of external datasets. For the correlation, we used the cell type average of normalized expression values.
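The comparison described in this caption (select the most variable genes, then correlate cell-type averages by rank) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names, the expression values, and the choice of population variance as the variability measure are assumptions, and only three genes are selected instead of 1,000.

```python
from statistics import fmean, pvariance


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def top_variable_genes(expr, n):
    """Return the n genes with the highest variance across cell-type averages."""
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)[:n]


# Hypothetical cell-type averages (columns: T cells, B cells, monocytes).
ours = {
    "CD3E": [9.0, 1.0, 0.5],
    "CD19": [1.0, 8.0, 0.8],
    "CD14": [0.5, 0.7, 9.5],
    "ACTB": [10.0, 10.1, 9.9],
}
# Hypothetical T-cell profile from an external dataset.
external_t = {"CD3E": 7.5, "CD19": 0.9, "CD14": 0.2, "ACTB": 11.0}

genes = top_variable_genes(ours, 3)
rho = spearman([ours[g][0] for g in genes], [external_t[g] for g in genes])
print(genes, rho)
```

Because Spearman correlation works on ranks, it tolerates the differences in scale that are expected between platforms and normalization schemes.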
Figure 5
Two Aspects of mRNA Composition: Heterogeneity and Abundance. (A and B) Heterogeneity. (A) The cumulative sum of the median TPM values of nine relevant cell types calculated from values sorted in decreasing order. The total sum of TPM values is always 10^6. (B) The minimum number of genes that contribute to 80% of total gene expression in the 29 cell types. This number corresponds to the dashed red line in (A). (C and D) Abundance. (C) mRNA scaling factors for the 29 immune cell types calculated with four methods (STAR Methods). For the clustering distance between rows, we used the Spearman correlation. (D) Pearson correlation matrix for the values reported in (C).
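The heterogeneity measure in panels (A) and (B) amounts to a cumulative sum over expression values sorted in decreasing order. A minimal sketch, assuming a hypothetical function name and a toy five-gene profile:

```python
def genes_for_fraction(tpm, fraction=0.8):
    """Smallest number of top-expressed genes whose TPM sum reaches `fraction` of the total."""
    target = fraction * sum(tpm)
    running = 0.0
    for count, value in enumerate(sorted(tpm, reverse=True), start=1):
        running += value
        if running >= target:
            return count
    return len(tpm)


# Toy profile of five genes whose TPM values sum to 10^6.
toy_tpm = [150000.0, 500000.0, 40000.0, 250000.0, 60000.0]
n_genes = genes_for_fraction(toy_tpm)
print(n_genes)
```

A small count indicates a transcriptome dominated by a few highly expressed genes; a large count indicates that expression is spread across many genes.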
Figure 6
Absolute Deconvolution of RNA-Seq PBMC Samples. (A) Exhaustive search for cell types that are suitable for deconvolution from PBMC-derived RNA-seq data. For each cell type, we report the mean and SD of Pearson correlations obtained by deconvolution of all possible combinations of cell types (merged and non-merged) that reconstitute a PBMC sample. Cell types that have been chosen for the deconvolution analysis in (B) are outlined in blue. (B) Comparison of deconvoluted and flow cytometry proportions on 17 immune cell types with respect to PBMCs. The concordance correlation coefficient (ccc) and the Pearson correlation coefficient (r) are shown on each plot.
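The two agreement statistics reported in panel (B) answer different questions: r measures linear association, while the ccc also penalizes systematic offsets from the identity line. The sketch below uses Lin's concordance correlation coefficient with population (1/n) moments; the function name and the proportions are hypothetical.

```python
from statistics import fmean


def pearson_and_ccc(x, y):
    """Return (Pearson r, Lin's concordance correlation coefficient)."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    r = sxy / (sxx * syy) ** 0.5
    ccc = 2 * sxy / (sxx + syy + (mx - my) ** 2)
    return r, ccc


# Hypothetical flow cytometry proportions and estimates shifted upward by 0.05.
flow = [0.10, 0.20, 0.30, 0.40]
estimated = [0.15, 0.25, 0.35, 0.45]
r, ccc = pearson_and_ccc(flow, estimated)
print(r, ccc)
```

Here r is 1 because the estimates track the flow values perfectly, but the ccc is about 0.91 because every estimate is biased by the same amount. This is why a high r alone does not show that deconvolved proportions are absolute.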
Figure 7
Benchmarks and Validations of Different Deconvolution and Normalization Methods. (A) Comparison of five deconvolution algorithms in the presence and absence of noise and at increasing size of the signature matrix. The total RMSE is calculated by using the estimated and ground-truth proportions of the 17 cell types of RNA-seq deconvolution. (B) Comparison of results obtained from deconvolution methods with and without constraints and using our signature matrix for RNA-seq deconvolution with either TPM values or absolute expression values (ABIS-seq). (C) Comparison of RNA-seq and microarray deconvolution results with different normalization methods. Each dot is a different cell type.
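The RMSE used in panel (A) can be computed as below. How the paper pools errors across samples and cell types is not stated in this caption, so pooling every value into a single mean is an assumption here, as are the function name and the numbers.

```python
def rmse(estimated, truth):
    """Root-mean-square error between two equal-length sequences of proportions."""
    squared = [(e - t) ** 2 for e, t in zip(estimated, truth)]
    return (sum(squared) / len(squared)) ** 0.5


# Hypothetical estimated and ground-truth proportions for three cell types.
est = [0.5, 0.3, 0.2]
true_props = [0.4, 0.4, 0.2]
error = rmse(est, true_props)
print(error)
```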
