Nat Genet. 2019 Oct;51(10):1442-1449.
doi: 10.1038/s41588-019-0494-8. Epub 2019 Sep 9.

A compendium of promoter-centered long-range chromatin interactions in the human genome

Inkyung Jung et al. Nat Genet. 2019 Oct.

Abstract

A large number of putative cis-regulatory sequences have been annotated in the human genome, but the genes they control remain poorly defined. To bridge this gap, we generate maps of long-range chromatin interactions centered on 18,943 well-annotated promoters for protein-coding genes in 27 human cell/tissue types. We use this information to infer the target genes of 70,329 candidate regulatory elements and suggest potential regulatory function for 27,325 noncoding sequence variants associated with 2,117 physiological traits and diseases. Integrative analysis of these promoter-centered interactome maps reveals widespread enhancer-like promoters involved in gene regulation and common molecular pathways underlying distinct groups of human traits and diseases.

Conflict of interest statement

Competing Interests Statement

Bing Ren is a co-founder of Arima Genomics, Inc. Anthony Schmitt is an employee of Arima Genomics.

Figures

Figure 1. Genome-wide mapping of promoter-centered chromatin interactions in diverse human tissues and cell types.
a, A schematic of the pcHi-C procedure. b, Barplots of normalized promoter-centered chromatin interaction frequencies (y-axis) emanating from the ADAMTS1 promoter (translucent gray). The identified chromatin interactions are shown below the axis (purple loops). Highlighted in translucent yellow are cell/tissue type specific interactions. c, Boxplots showing the fold enrichment of the interaction frequencies between the active (colored dots) or bivalent promoters (gray dots) and each chromatin state. The 17 chromatin states shown were obtained by processing the 18-state ChromHMM model after merging the genic enhancer 1 and 2 annotations. Two-sided KS tests were performed between interactions originating from active promoter regions (colored dots) and those from bivalent promoters (gray dots) for the samples listed on the right (n = 21) (** P value < 0.01 and *** P value < 0.001). The chromatin states that interact more frequently with active promoters than bivalent promoters are highlighted in translucent yellow. The chromatin states that interact more frequently with bivalent promoters than active promoters are highlighted in translucent blue. For the boxplots, the box represents the interquartile range (IQR), and the whiskers correspond to the highest and lowest points within 1.5 × IQR.
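The boxplot convention described in the caption (box spans the IQR, whiskers reach the most extreme points within 1.5 × IQR of the box) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' plotting code; the function name `box_and_whiskers` is hypothetical, and the choice of the "inclusive" quartile method is an assumption, since plotting tools differ in how they compute quartiles.

```python
from statistics import quantiles


def box_and_whiskers(values):
    """Return (q1, q3, lower_whisker, upper_whisker) for a list of numbers.

    The box spans q1..q3 (the interquartile range, IQR). Each whisker ends at
    the most extreme data point that still lies within 1.5 * IQR of the box.
    """
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other tools may interpolate differently.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers are actual data points, not the fences themselves.
    lower_whisker = min(v for v in data if v >= low_fence)
    upper_whisker = max(v for v in data if v <= high_fence)
    return q1, q3, lower_whisker, upper_whisker
```

Points beyond the whiskers would typically be drawn as individual outliers; whether the original figures draw them is not stated in the caption.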
Figure 2. Inference of target genes of cis-regulatory sequences from pcHi-C data.
a, Illustrative LocusZoom plot of eQTLs for VLDLR (top) and pcHi-C interactions in aorta tissue (bottom). Highlighted in translucent yellow are the VLDLR promoter and an eQTL connected by a pcHi-C interaction. Dots represent the P values of SNPs’ association with VLDLR gene expression levels in the aorta (data obtained from GTEx). Dots are also color-coded based on their linkage disequilibrium scores with a tagging SNP. The blue bars indicate the recombination rate. b, Browser snapshots of the POU3F3 locus, showing positive correlation between the H3K27ac signals at a distal cRE (bottom left) and expression levels (bottom middle) of the promoter connected by long-range chromatin interactions (bottom right). The significant chromatin interaction between the POU3F3 promoter and a distal cRE is shown at the top (translucent yellow). c, Boxplots illustrating the H3K27ac signals at the cREs (n = 7,712) connected by hippocampus (HC, colored blue) specific pcHi-C interactions. These cREs are marked by higher levels of H3K27ac in hippocampus than in other cell/tissue types (one-sided KS test P value < 0.005). For the boxplots, the box represents the interquartile range (IQR), and the whiskers correspond to the highest and lowest points within 1.5 × IQR. d-f, Heatmaps demonstrating the enrichment of pcHi-C interactions (column in Fig. 2d), z-score transformed H3K27ac RPKM values at cREs (column in Fig. 2e), and z-score transformed RNA-seq FPKM values at the cREs’ putative target genes (column in Fig. 2f) for given cell/tissue-specific cRE-promoter pairs in the corresponding cell/tissue type (rows in Fig. 2d-f). KS tests were performed between pcHi-C interaction frequencies, z-score transformed H3K27ac RPKM values, and z-score transformed RNA-seq FPKM values in the matched cell/tissue types (values in diagonal in each heatmap) and those in other cell/tissue types (values in off diagonal in each heatmap), demonstrating significant association of cRE-promoter pairs with cell/tissue-specific cRE H3K27ac signals and gene expression (two-sided KS test P value < 2.2 × 10⁻¹⁶).
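The diagonal versus off-diagonal comparison in panels d-f can be illustrated with a small sketch: z-score each row of a signal matrix, split the values into matched (diagonal) and unmatched (off-diagonal) groups, and compute the two-sample KS statistic between them. This is an illustrative approximation, not the authors' pipeline. The function names are hypothetical, the use of the population standard deviation is an assumption, and only the KS statistic is computed (no P value).

```python
from bisect import bisect_right
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Z-score each row, e.g. one signal measured across cell/tissue types."""
    out = []
    for row in matrix:
        mu, sd = mean(row), pstdev(row)  # assumption: population std. deviation
        out.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    return out


def split_diagonal(matrix):
    """Separate matched (diagonal) from unmatched (off-diagonal) entries."""
    n = len(matrix)
    diag = [matrix[i][i] for i in range(n)]
    off = [matrix[i][j] for i in range(n) for j in range(len(matrix[i])) if i != j]
    return diag, off


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )
```

A statistic near 1 means the matched and unmatched values barely overlap; turning it into a P value requires the KS distribution, which a statistics library would normally supply.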
Figure 3. Enhancer-like promoters involved in regulation of distal target genes.
a, Browser snapshots of the TMED4 locus showing the RefSeq genes (top), H3K27ac signals (middle, n = 24), and pcHi-C chromatin interactions (bottom). Highlighted in translucent blue are promoter-promoter pairs showing highly correlated H3K27ac signal and significant pcHi-C interactions. Highlighted in gray is an adjacent promoter of the TMED4. Shown below are Pearson correlation coefficient (PCC) values based on H3K27ac signals and links based on pcHi-C interactions, with MSC as the acronym for mesenchymal stem cell. b, Density plots showing distributions of PCC values (x-axis) of H3K27ac (blue, median of PCC = 0.45, n = 48,893), H3K4me1 (orange, median of PCC = 0.67, n = 48,893), and H3K4me3 (green, median of PCC = 0.64, n = 48,893) signals for P-P pcHi-C interactions together with a random expectation (gray, median of PCC = 0.02, n = 48,142). c, A pie chart showing the fraction of unique P-P interactions matched by eQTL associations, of which 5.7% are P-P interactions (n = 1,976) in 12 matched tissue types (n = 34,880). d, Browser snapshots of RNA-seq results between control (n = 2) and mutant (n = 2) clones with deletion of the core promoter regions of the ARIH2OS. The expression of the NCKIPSD gene, which displays chromatin interactions with the ARIH2OS gene promoter, was significantly down-regulated in the mutant clones (FDR adjusted P value obtained from cuffdiff with two mutant clones = 0.02). e, Browser snapshots showing the promoter containing eQTL (translucent yellow with a scissors symbol) targeted by sgRNAs and its distal target gene, ABCF3 (translucent green), together with H3K27ac and chromatin accessibility (DNase I). The relative mRNA expression levels of the ABCF3 quantified by RT-qPCR are shown below (* one-sided KS test P value < 0.05 derived from three mutant clones). Error bars indicate standard deviation of three mutant clones and y-axis indicates mean values.
Figure 4. Analysis of human diseases and physiological traits based on the putative target genes of GWAS SNPs.
a, Browser snapshots showing multiple cREs harboring GWAS-SNPs (translucent yellow with a scissors symbol) and their common target gene, NT5DC2 (translucent green), together with signals of H3K27ac (ChIP-seq) and chromatin accessibility (DNase I) (left). The DNA fragments containing these cREs interact with the NT5DC2 gene promoter region as evidenced by pcHi-C analysis (arcs). The relative mRNA expression levels of the NT5DC2 upon induced mutations of GWAS SNPs with sgRNAs were quantified by RT-qPCR (right). Error bars indicate standard deviation of two mutant clones with technical triplicates and y-axis indicates mean values. b, Hierarchical clustering of human diseases and traits (n = 687) based on similarities of the putative target genes of trait-associated SNPs and SNPs in LD. The color intensity of each dot indicates Pearson correlation coefficient (PCC) of the putative target genes between two diseases or traits. Color bars on the left and top demarcate the clusters. c, d, Shown are similarities, as measured by PCC, between traits (n = 687) in the same order as Fig. 4b, based on either the nearest genes of the GWAS SNPs (c) or the GWAS SNPs alone (d). The color intensity of each dot indicates PCC of target gene similarities between two traits. e, Hierarchical clustering of GO biological processes (each column, n = 126) for the trait clusters defined in Fig. 4b (each row, n = 40). Each entry indicates –log10(P value) of GO biological processes in the corresponding cluster obtained from DAVID. Several representative biological processes are highlighted.
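The trait-to-trait similarity in panel b is described as a Pearson correlation coefficient (PCC) of putative target genes between two traits. One way this could be computed is sketched below, assuming each trait is encoded as a binary vector over the union of target genes. That encoding is an assumption, as the caption does not specify it, and the names `pearson` and `trait_similarity` as well as the placeholder trait and gene labels in the usage note are hypothetical. The hierarchical clustering step is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / sqrt(vx * vy)


def trait_similarity(trait_genes):
    """Map each (trait, trait) pair to the PCC of their target-gene vectors.

    trait_genes: dict of trait name -> set of putative target gene names.
    Assumption: presence/absence (1/0) encoding over the union of all genes.
    """
    genes = sorted(set().union(*trait_genes.values()))
    vectors = {
        trait: [1 if g in targets else 0 for g in genes]
        for trait, targets in trait_genes.items()
    }
    traits = sorted(vectors)
    return {(a, b): pearson(vectors[a], vectors[b]) for a in traits for b in traits}
```

For example, calling `trait_similarity({"trait_A": {"g1", "g2"}, "trait_B": {"g1", "g2"}, "trait_C": {"g3"}})` on these placeholder labels yields a high PCC for the pair sharing all target genes and a negative PCC for pairs with disjoint targets.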