Nat Genet. 2016 Oct;48(10):1193-203.
doi: 10.1038/ng.3646. Epub 2016 Aug 15.

Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution

M Ryan Corces et al. Nat Genet. 2016 Oct.

Abstract

We define the chromatin accessibility and transcriptional landscapes in 13 human primary blood cell types that span the hematopoietic hierarchy. Exploiting the finding that the enhancer landscape better reflects cell identity than mRNA levels, we enable 'enhancer cytometry' for enumeration of pure cell types from complex populations. We identify regulators governing hematopoietic differentiation and further show the lineage ontogeny of genetic elements linked to diverse human diseases. In acute myeloid leukemia (AML), chromatin accessibility uncovers unique regulatory evolution in cancer cells with a progressively increasing mutation burden. Single AML cells exhibit distinctive mixed regulome profiles corresponding to disparate developmental stages. A method to account for this regulatory heterogeneity identified cancer-specific deviations and implicated HOX factors as key regulators of preleukemic hematopoietic stem cell characteristics. Thus, regulome dynamics can provide diverse insights into hematopoietic development and disease.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Interrogation of chromatin landscapes in primary blood cells
(a) Schematic of the human hematopoietic hierarchy shows the 13 primary cell types analyzed in this work. Granulocytes and megakaryocytes were excluded. The cell types comprising the CD34+ HSPCs are indicated. Colors used in this schematic are consistent throughout the manuscript. (b) Diagram of analyses performed using paired ATAC-seq and RNA-seq data in both primary human blood cells and primary patient AML cells. (c) Normalized ATAC-seq profiles at developmentally important genes. Profiles represent the union of all technical and biological replicates for each cell type. See Supplementary Table 1 for the exact number of technical and biological replicates for each cell type. Genomic coordinates of regions: GATA2 chr3:128,197,777–128,218,433; CEBPB chr20:48,800,260–48,904,715; GYPA chr4:145,020,689–145,070,000; BCL11B chr14:99,513,898–99,796,947; BLK chr8:11,343,117–11,429,285. All Y axis scales range from 0–10 in normalized arbitrary units. X axis scales indicated by scale bar. (d–g) Scatter plot showing correlation of (d) technical replicates, (e) different human donors, (f) ATAC-seq and DNase-seq data derived from CD34+ HSPCs, and (g) ATAC-seq HSCs with bulk CD34+ HSPCs. The R values reported are calculated from correlations of all peaks. Plots show 50,000 random peaks, each with at least 5 reads.
Figure 2. Distal regulatory elements enable accurate classification of the hematopoietic hierarchy
(a,b) Unsupervised hierarchical clustering of (a) RNA-seq (N=49) and (b) ATAC-seq (N=77) data from all replicates of 13 normal hematopoietic cell types. Values shown are Pearson correlation coefficients. Cluster purity quantifies the degree that cells of the same lineage (color coded in the key) are clustered together. For RNA-seq, clustering was performed using variance stabilizing transform-normalized expression data for all expressed annotated genes. For ATAC-seq, clustering was performed on all peaks using quantile normalized quantitative read coverage data. (c,d) Phylogenetic dendrograms of (c) RNA-seq and (d) ATAC-seq data showing inter-cell type correlations derived from aggregate averages of all biological and technical replicates. Length of tree branches represents Euclidean distance. Data represents the union of all technical and biological replicates for each cell type. (e,f) Hierarchical clustering of ATAC-seq profiles (N=77) mapping to (e) promoters and (f) distal regulatory elements. Values shown are Pearson correlation coefficients. Promoter-proximal peaks are defined as +/−1 kilobase from an annotated TSS. Distal element peaks are defined as those peaks greater than 1 kilobase from an annotated TSS. (g) ATAC-seq peaks in the TET2 locus show highly variable distal regulatory landscapes (left) and relatively constitutive expression of TET2 (right). Data represents the union of all technical and biological replicates for each cell type: HSC=7; CMP=8; GMP=7; MEP=7; CD8=5; NK=6. Error bars represent 1 standard deviation. Genomic coordinates: chr4:106031731–106073198. Y axis scale ranges from 0–10 in normalized arbitrary units.
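The promoter-proximal versus distal split described in (e,f) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function name `classify_peak`, the use of the peak midpoint, and the toy TSS coordinates are all assumptions made for the example. Only the 1 kilobase threshold comes from the caption.

```python
from bisect import bisect_left

def classify_peak(chrom, start, end, tss_by_chrom, window=1000):
    """Label a peak 'promoter' if its midpoint lies within `window` bp
    of the nearest TSS on the same chromosome, otherwise 'distal'."""
    positions = tss_by_chrom.get(chrom, [])  # TSS coordinates, sorted ascending
    if not positions:
        return "distal"
    mid = (start + end) // 2  # assumption: a peak is represented by its midpoint
    i = bisect_left(positions, mid)
    # The nearest TSS is either just left or just right of the insertion point.
    candidates = positions[max(i - 1, 0): i + 1]
    nearest = min(abs(mid - p) for p in candidates)
    return "promoter" if nearest <= window else "distal"

# Hypothetical TSS annotation (toy coordinates, not real gene positions).
toy_tss = {"chrA": [10_000, 50_000]}

label_near = classify_peak("chrA", 9_700, 10_100, toy_tss)   # midpoint 9,900
label_far = classify_peak("chrA", 30_000, 30_500, toy_tss)   # midpoint 30,250
label_unknown = classify_peak("chrB", 100, 600, toy_tss)     # no TSS on chrB
```

Keeping the TSS list sorted allows a binary search per peak, which matters when many thousands of peaks are classified.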
Figure 3. Enhancer cytometry allows for deconvolution of the hematopoietic hierarchy
(a) Normalized ATAC-seq profiles of HSPC subsets and ensemble CD34+ HSPC DNase-seq profiles illustrating heterogeneity amongst CD34+ HSPC subpopulations. Predicted cell fractions based on flow cytometry of 6 healthy bone marrow donors are shown on the left and the nearest annotated genes are shown on the bottom. Genomic coordinates: PTPRG-AS1 chr3:62194000–62196000; LOC439933 chr4:35761750–35763750; MIR1915 chr10:21639750–21641750; HNRNPD chr4:83205250–83207250. Y axis scale ranges from 0–10 in normalized arbitrary units. (b) Schematic of enhancer cytometry from cell-type specific distal elements (right panel, N=735). The signature matrix heatmap has an upper threshold of 500 where all elements with signal greater than 500 appear red. (c,d) Benchmarking of enhancer cytometry using randomly permuted synthetic mixtures to test robustness to (c) sequential subtraction and (d) randomized mixture content. Test data and training data are non-overlapping. (c) Synthesized ground truths are equal mixtures of the remaining cell types. In the left-most column, all cell types are present in equal parts in the ground truth data. Cell types are then sequentially subtracted from the synthesized ground truth starting with HSC until only NK cells remain. Error bars represent the standard deviation of 100 random permutations. (e) Enhancer cytometry of ATAC-seq data derived from FACS-purified bone marrow CD34+ HSPCs. (f) Correlation of predicted fractional contribution of each HSPC cell type by enhancer cytometry versus flow cytometric “ground truth” data of input CD34+ cells. X-axis represents the same data as shown in (e).
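The synthetic-mixture benchmark in (c,d) can be illustrated with a small deconvolution sketch. The caption does not state which regression method enhancer cytometry uses, so this example substitutes ordinary least squares followed by clipping and renormalization as a stand-in. The function name `estimate_fractions`, the random signature matrix, and the three toy cell types are assumptions; only the count of 735 elements is taken from (b).

```python
import numpy as np

def estimate_fractions(signature, mixture):
    """Estimate cell-type fractions from a mixed accessibility profile.

    signature: array of shape (n_elements, n_cell_types)
    mixture:   array of shape (n_elements,)
    """
    coef, *_ = np.linalg.lstsq(signature, mixture, rcond=None)
    coef = np.clip(coef, 0.0, None)  # fractions cannot be negative
    total = coef.sum()
    return coef / total if total > 0 else coef

rng = np.random.default_rng(0)
# Toy signature matrix: 735 elements by 3 hypothetical cell types.
signature = rng.uniform(0.0, 100.0, size=(735, 3))
true_fractions = np.array([0.5, 0.3, 0.2])
# Noise-free synthetic mixture with a known ground truth.
mixture = signature @ true_fractions
estimated = estimate_fractions(signature, mixture)
```

With a noise-free mixture and a full-rank signature matrix, the least-squares estimate matches the ground truth up to floating-point error. Real data would need a method that tolerates noise and collinear signatures.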
Figure 4. Integrative analysis of the hematopoietic regulome refines transcriptional circuitry driving cell specification and enriches the understanding of human disease
(a) Transcription factor dynamics showing major TFs driving hematopoietic regulomes. The size of the circle represents the effect of that motif in driving accessibility in human blood cells. The relative distance between circles represents the co-occurrence of motifs throughout hematopoietic differentiation (see methods). (b,c) Usage of the (b) GATA and (c) PAX motif throughout hematopoietic differentiation. Values represent the relative deviation of the motif accessibility, a measure of motif usage, compared to that in HSCs. (d) Footprint analysis of the GATA motif in MEP and CLP cells. (e) Pearson correlation of motif accessibility with transcription factor expression plotted against the significance of this correlation for GATA (top) and PAX (bottom) motifs. Red dots represent DNA-binding factors found in the analysis in Supplementary Fig. 9b to bind the given motif. Gray dots represent all other DNA-binding factors. (f,g) Expression of (f) GATA1 and (g) PAX5 phenocopies the usage of the (b) GATA and (c) PAX motifs throughout hematopoietic differentiation. (h–k) Relative deviation scores of chromatin accessibility within hematopoietic regulatory elements with GWAS SNPs for (h) mean corpuscular volume, (i) rheumatoid arthritis, (j) alopecia areata, and (k) Alzheimer’s disease (see methods). Darker red color is representative of enrichment of GWAS SNPs in the open chromatin regions of the given cell type.
Figure 5. Acute myeloid leukemia regulomes reveal a cooption of normal myelopoiesis
(a) Schematic of the leukemogenic process. (b) Mutation frequencies of HSCs isolated from AML patients (N=39). Color indicates the percent of cells harboring the given mutation as estimated from the variant allele frequency. Gray indicates a mutation present in leukemic cells but not observed in pHSC (i.e. a late mutation event, detection threshold = 2% of cells or 1% of alleles). Asterisks indicate the predicted first mutation. If a mutation is bi-allelic, the representative bar is divided in half. (c) Normalized ATAC-seq profiles at a control locus (chr19:36102236–36277236) from FACS-purified AML cell types. Profiles represent the union of all biological replicates for each cell type. Y axis scale ranges from 0–10 in normalized arbitrary units. (d,e) Mean variance of ATAC-seq signal across the linear genome as calculated by a moving average for (d) each leukemic cell stage and (e) their corresponding normal cell types (see methods). The distance from the center of the graph represents the variance. The position along the circumference represents the genomic position. (f) Normalized ATAC-seq profiles near GATA2 (left; chr3:128,197,777–128,218,433) and CEBPB (right; chr20:48,800,260–48,904,715). Profiles shown are for SU444. Y axis scale ranges from 0–10 in normalized arbitrary units. (g) Cumulative variance of AML ATAC-seq data explained by the first N principal components derived from normal hematopoiesis. (h) Myeloid development score derived from ATAC-seq data in normal blood cell types (N=4 biological replicates) and AML cell types.
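The detection threshold in (b), 2% of cells or 1% of alleles, reflects the usual conversion from variant allele frequency to the fraction of mutated cells. The sketch below assumes a diploid locus with no copy-number change, where a heterozygous mutation contributes one mutant allele per cell and a bi-allelic mutation contributes two. The function name and the cap at 100% are illustrative assumptions, not the authors' code.

```python
def percent_cells_with_mutation(vaf_percent, biallelic=False):
    """Convert a variant allele frequency (% of alleles) into an
    estimated % of cells carrying the mutation (diploid locus assumed)."""
    if not 0 <= vaf_percent <= 100:
        raise ValueError("vaf_percent must be between 0 and 100")
    mutant_alleles_per_cell = 2 if biallelic else 1
    # Each cell has 2 alleles, so cells% = VAF% * 2 / mutant alleles per cell.
    return min(100.0, vaf_percent * 2 / mutant_alleles_per_cell)

threshold_cells = percent_cells_with_mutation(1)        # 1% of alleles
clonal_het = percent_cells_with_mutation(50)            # fully clonal, heterozygous
biallelic_30 = percent_cells_with_mutation(30, biallelic=True)
```

Under these assumptions a heterozygous variant at 1% of alleles corresponds to 2% of cells, which matches the threshold quoted in the caption.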
Figure 6. Enhancer cytometry and single-cell regulomes support a model of regulatory heterogeneity and allow for deconvolution of AML-specific biology
(a) Enhancer cytometry deconvolution showing the predicted contribution of various normal cell types to the regulatory landscape of different AML cell types. (b) Schematic of single-cell ATAC-seq protocol and analysis (see methods). (c,d) Projection of ATAC-seq data derived from (c) single SU070 LSCs (N=71) and (d) single SU070 blast cells (N=42) onto the principal components derived from the normal hematopoietic hierarchy. (e,f) Relative density of (e) single LMPPs (N=68) and monocytes (N=90) and (f) single SU070 LSCs (N=62), SU070 blasts (N=42), SU353 LSCs (N=36), and SU353 blasts (N=52) projected onto a one-dimensional representation of the myeloid developmental progression. (g,h) Scatter plot showing the correlation of ATAC-seq data derived from SU353 blast cells with (g) the closest normal cell type (GMP) (R=0.86) and (h) the enhancer cytometry-defined synthetic normal (R=0.91). Cutoff for differential peaks is a log2(fold change) greater than 3. The R values reported are calculated from correlations of all peaks. Plots show 50,000 random peaks, each with at least 5 reads. (i) Comparison of AML cell types to synthetic normal analogs. For each sample, the closest normal cell type is indicated by the color of the bar. The percent of the total significant peaks (called by closest normal comparison) that are removed by comparison to synthetic normal analogs is plotted for each sample.
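Projecting new samples onto principal components learned from a reference set, as in (c,d), can be sketched as follows. This is a generic illustration under stated assumptions: the random matrices stand in for normalized accessibility data, centering uses the reference mean, and the helper names `fit_components` and `project` are hypothetical. The authors' exact normalization and preprocessing are not described in the caption.

```python
import numpy as np

def fit_components(reference, n_components):
    """Return the mean and the top principal axes of the reference
    samples (rows = samples, columns = peaks)."""
    mean = reference.mean(axis=0)
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(samples, mean, components):
    """Project samples onto the reference principal axes."""
    return (samples - mean) @ components.T

rng = np.random.default_rng(1)
normal_profiles = rng.normal(size=(13, 50))  # toy stand-in for normal cell types
single_cells = rng.normal(size=(4, 50))      # toy stand-in for single-cell profiles

mean, components = fit_components(normal_profiles, n_components=2)
coords = project(single_cells, mean, components)  # one (PC1, PC2) pair per cell
```

Because the axes are fit only on the reference data, the new samples do not influence the coordinate system they are placed in.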
Figure 7. Early chromatin accessibility alterations within pHSCs cause defects in differentiation which correlate with adverse patient outcomes
(a) K-means clustering was used to identify 7 clusters of co-varying peaks, termed regulatory modules (see methods). (b) Enrichment of each regulatory module shown in (a) at each stage of leukemia evolution. All biological replicates for each AML cell type were merged. Error bars shown represent 1 S.D. across all samples of that given cell type. (c) Enrichment and hierarchical clustering of motifs enriched in each of the 7 AML-specific regulatory modules. (d,e) Retention of CD34 expression as measured by flow cytometric analysis after 6 days of enforced differentiation down the (d) myeloid lineage and (e) erythroid lineage. Error bars represent 1 S.D. Experiments were done in triplicate. (f,g) Fold change in the percent of cells expressing CD34 as measured by flow cytometric analysis of cord blood-derived HSCs transduced with shRNAs targeting HOXA9 or a non-targeting control. CD34 expression was measured after 6 days of differentiation down the (f) myeloid or (g) erythroid lineage. Only GFP+ transduced cells were analyzed. Error bars represent 1 S.D. Experiments were done in triplicate. (h) Overall and (i) relapse-free survival of patients stratified by pre-leukemic burden (High burden, N=24; Low burden, N=15). High pre-leukemic burden is defined as > 20% of HSCs harboring at least the first pre-leukemic mutation. P values comparing two Kaplan-Meier survival curves were calculated using the log-rank (Mantel-Cox) test. Hazard ratios (HR) were determined using the Mantel-Haenszel approach. **p<0.01, ***p<0.001, ****p<0.0001 derived from two-tailed t-test.
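The log-rank comparison of two survival curves mentioned for (h,i) can be sketched in plain Python. This is a textbook two-group log-rank statistic run on toy data, not the authors' analysis or their patient data. The function name `logrank_test` and the input layout (parallel lists of times and event flags) are assumptions made for the example.

```python
from math import erfc, sqrt

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    events: 1 = event observed, 0 = censored.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)           # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:  # the variance term is undefined when one subject remains
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    stat = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return stat, erfc(sqrt(stat / 2))

# Toy data: group A fails early, group B fails late, no censoring.
stat, p_value = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
```

For serious use, an established survival analysis package is preferable, since it also handles ties, stratification, and hazard ratio estimation.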
