Genome Res. 2018 Jan;28(1):122-131.
doi: 10.1101/gr.224436.117. Epub 2017 Dec 5.

Impact of regulatory variation across human iPSCs and differentiated cells

Nicholas E Banovich et al. Genome Res. 2018 Jan.

Abstract

Induced pluripotent stem cells (iPSCs) are an essential tool for studying cellular differentiation and cell types that are otherwise difficult to access. We investigated the use of iPSCs and iPSC-derived cells to study the impact of genetic variation on gene regulation across different cell types and as models for studies of complex disease. To do so, we established a panel of iPSCs from 58 well-studied Yoruba lymphoblastoid cell lines (LCLs); 14 of these lines were further differentiated into cardiomyocytes. We characterized regulatory variation across individuals and cell types by measuring gene expression levels, chromatin accessibility, and DNA methylation. Our analysis focused on a comparison of inter-individual regulatory variation across cell types. While most cell-type-specific regulatory quantitative trait loci (QTLs) lie in chromatin that is open only in the affected cell types, we found that 20% of cell-type-specific regulatory QTLs are in shared open chromatin. This observation motivated us to develop a deep neural network to predict open chromatin regions from DNA sequence alone. Using this approach, we were able to use the sequences of segregating haplotypes to predict the effects of common SNPs on cell-type-specific chromatin accessibility.

Figures

Figure 1.
Systematic measurements of molecular phenotypes across reprogramming and differentiation. (A) Summary of data collection. (B) Correlation matrix of gene expression from our samples and samples from ENCODE (*) and GTEx. Our LCL samples cluster most closely with LCL samples from ENCODE, while our iPSC and iPSC-CM lines cluster most closely with H1-ESC (ENCODE) and heart (GTEx), respectively. Dark purple: GTEx bone marrow. (C) Violin plots representing the per-individual log2 of the average squared distance from the mean (Supplemental Materials) for iPSC, LCL, and iPSC-CM gene expression levels. Plots for chromatin accessibility and DNA methylation levels are shown in Supplemental Figure S7.
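The statistic plotted in panel C can be illustrated with a minimal sketch. The exact definition is given in the Supplemental Materials, which are not reproduced here, so the reading below is an assumption: for each individual, the squared distance from the per-gene mean across individuals is averaged over genes and then log2-transformed. The function name and the input layout (one row per gene, one column per individual) are hypothetical.

```python
import math

def per_individual_log2_msd(expr):
    """Assumed reading of the Figure 1C statistic (not taken from the paper's code).

    expr: list of rows, one per gene; each row holds one value per individual.
    Returns one value per individual: log2 of the mean, over genes, of the
    squared distance between that individual's value and the gene's mean.
    """
    n_genes = len(expr)
    n_ind = len(expr[0])
    # Mean expression of each gene across all individuals.
    gene_means = [sum(row) / n_ind for row in expr]
    result = []
    for j in range(n_ind):
        msd = sum((expr[g][j] - gene_means[g]) ** 2 for g in range(n_genes)) / n_genes
        # math.log2 raises ValueError if msd is 0 (an individual exactly at the mean).
        result.append(math.log2(msd))
    return result

# Toy input: two genes measured in two individuals.
toy = [[1.0, 3.0],
       [2.0, 6.0]]
values = per_individual_log2_msd(toy)
```

Larger values indicate individuals whose profile lies farther from the panel average, which is what the violin plots compare across cell types.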
Figure 2.
Mechanisms of cell-type–specific regulatory variation. (A) QQ-plot of LCL and iPSC eQTL signal conditioned on LCL- and iPSC-specific caQTLs. Higher enrichment of LCL (iPSC) eQTLs among LCL (iPSC) caQTLs links cell-type–specific regulation of chromatin accessibility to cell-type–specific regulation of gene expression. (B) Chromatin accessibility signal around cell-specific caQTLs in corresponding cell types (black rectangles) and in other cell types. A lack of accessibility in other cell types suggests that cell-specific caQTLs often affect cell-specific accessible regions, as in the example in panel C. (C,D) Examples of cell-type–specific regulatory effects of genetic variation. The SNP is correlated with accessibility of an iPSC-specific open chromatin region in iPSCs only (C) or of a nonspecific open chromatin region in LCLs only (D). (E) Scatter plot of iPSC and LCL chromatin accessibility at iPSC-specific caQTLs. About 20% of iPSC-specific caQTLs lie in regions that are also accessible in LCLs. The corresponding plot for LCL-specific caQTLs is shown in Supplemental Figure S15. (F) Example of an iPSC-specific caQTL that is also an iPSC-specific eQTL. SNP rs9367277 is associated with both chromatin accessibility of a strong enhancer and with expression of the CD2AP gene in iPSCs. Interestingly, rs9367277 lies in a transposable element of the ERVL family, which is preferentially activated in embryonic stem cells (Kunarso et al. 2010).
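The conditioned QQ-plot in panel A can be sketched as follows: restrict the eQTL p-values to SNPs that are also caQTLs, then compare their sorted -log10 values with the quantiles expected under a uniform null. This is a generic illustration, not the authors' pipeline; the function names, the dictionary layout, and the i/(n+1) plotting positions are assumptions.

```python
import math

def condition_on(pvalues_by_snp, snp_subset):
    """Keep only the eQTL p-values for SNPs in the conditioning set (e.g. caQTL SNPs)."""
    return [p for snp, p in pvalues_by_snp.items() if snp in snp_subset]

def qq_points(pvalues):
    """Return (expected, observed) -log10(p) pairs, sorted in ascending order.

    Under a uniform null, the i-th smallest of n p-values is roughly i/(n+1).
    All p-values must lie in (0, 1]; a p-value of 0 would make log10 fail.
    """
    n = len(pvalues)
    observed = sorted(-math.log10(p) for p in pvalues)
    expected = sorted(-math.log10(i / (n + 1)) for i in range(1, n + 1))
    return list(zip(expected, observed))

# Hypothetical eQTL p-values keyed by SNP identifier.
eqtl_p = {"snp_a": 0.5, "snp_b": 0.01, "snp_c": 0.2}
caqtl_snps = {"snp_a", "snp_b"}
points = qq_points(condition_on(eqtl_p, caqtl_snps))
```

Points lying above the diagonal (observed greater than expected) indicate enrichment of eQTL signal within the conditioning set.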
Figure 3.
Predicting chromatin activity from sequence using deep neural networks. (A) OrbWeaver is a four-layered neural network where the parameters of the first convolutional layer are fixed to known position weight matrices of human transcription factors. The activation function used in each of the convolutional and dense layers is the Rectified Linear Unit (ReLU). (B) The OrbWeaver model for one cell type poorly predicts open chromatin in other cell types (gray), highlighting that the model captures cell-type–specific regulatory elements. (C) Transcription factors important for each locus were identified using DeepLIFT scores; this panel illustrates the top key TFs for each of the seven categories of chromatin activity and the fraction of loci explained by them. (D) An example of a locus that is open in iPSCs and LCLs but was identified to be an iPSC-specific caQTL. The subpanels on the left show the raw ATAC-seq signal in each cell type stratified by genotype of the most significant SNP of the iPSC caQTL. The subpanels on the right show the marginal change in OrbWeaver predictions due to mutating the reference base at each position to an alternate base. The sequence shown corresponds to the shaded portion on the left subpanels, and the reported Δpred values correspond to the change between alleles of the most significant SNP. The TF important for this locus as identified by DeepLIFT is YB-1, a factor highly expressed in all three cell types. (E) Scatter plot comparing the observed allelic imbalance at iPSC caQTLs, estimated by WASP, and the predicted difference in median chromatin activity between haplotypes tagged by the two alleles of the causal SNP. Note that the OrbWeaver model was learned using the reference genome sequence alone and had no information regarding genetic variation in the population when learning the model parameters.
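Two ideas from this caption can be illustrated with a toy sketch: a convolutional filter whose weights are fixed to a position weight matrix and followed by ReLU (panel A), and the marginal change in prediction when each reference base is mutated to an alternate base (panel D). This is not OrbWeaver itself. The max-pooling step, the single linear layer with a sigmoid, and every name and number below are assumptions made for illustration.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of [A, C, G, T] indicator vectors."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def relu(x):
    return x if x > 0.0 else 0.0

def pwm_conv(seq, pwm):
    """Slide a fixed PWM (one [A, C, G, T] score row per position) along seq.

    Returns the ReLU-activated match score of every window.
    seq must be at least as long as the PWM.
    """
    x = one_hot(seq)
    width = len(pwm)
    scores = []
    for start in range(len(seq) - width + 1):
        s = sum(pwm[k][c] * x[start + k][c] for k in range(width) for c in range(4))
        scores.append(relu(s))
    return scores

def predict(seq, pwms, weights, bias):
    """Toy accessibility score: max-pool each PWM channel, then linear + sigmoid."""
    feats = [max(pwm_conv(seq, pwm)) for pwm in pwms]
    z = bias + sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-z))

def mutation_scan(seq, pwms, weights, bias):
    """Map (position, alternate base) to prediction(mutated) - prediction(reference)."""
    ref_pred = predict(seq, pwms, weights, bias)
    deltas = {}
    for i, ref_base in enumerate(seq):
        for alt in BASES:
            if alt != ref_base:
                mutated = seq[:i] + alt + seq[i + 1:]
                deltas[(i, alt)] = predict(mutated, pwms, weights, bias) - ref_pred
    return deltas

# Hypothetical PWM favouring the motif "ACG".
acg_pwm = [[2, -1, -1, -1], [-1, 2, -1, -1], [-1, -1, 2, -1]]
window_scores = pwm_conv("TTACGTT", acg_pwm)
deltas = mutation_scan("TTACGTT", [acg_pwm], [1.0], -3.0)
```

In this toy case, mutating a base inside the motif lowers the predicted score, while mutating a flanking base leaves it unchanged, which mirrors how a per-base scan highlights the positions a model relies on.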
Figure 4.
Modeling complex disease using iPSC-derived cells. (A) Heat map of enrichment P-values of GWAS signals near genes with cell-type–specific expression (Supplemental Materials). (B) Enrichments of SNPs associated with four different diseases in different partitions of the genome (computed using LDscore regression; point estimates ±95% confidence intervals). In both analyses, the autoimmune traits (multiple sclerosis [MS] or Crohn's disease [CD] and rheumatoid arthritis [RA]) show enrichment near genes and chromatin that are more active in LCLs, and the heart-related traits (coronary artery disease [CAD] and myocardial infarction [MI]) are enriched in iPSC-CM active regions.
