Genome Res. 2014 Jul;24(7):1138-46.
doi: 10.1101/gr.161992.113. Epub 2014 Mar 20.

Extreme HOT regions are CpG-dense promoters in C. elegans and humans

Ron A-J Chen et al. Genome Res. 2014 Jul.

Abstract

Most vertebrate promoters lie in unmethylated CpG-dense islands, whereas methylation of the more sparsely distributed CpGs in the remainder of the genome is thought to contribute to transcriptional repression. Nonmethylated CG dinucleotides are recognized by CXXC finger protein 1 (CXXC1, also known as CFP1), which recruits SETD1A (also known as Set1) methyltransferase for trimethylation of histone H3 lysine 4, an active promoter mark. Genomic regions enriched for CpGs are thought to be either absent or irrelevant in invertebrates that lack DNA methylation, such as C. elegans; however, a CXXC1 ortholog (CFP-1) is present. Here we demonstrate that C. elegans CFP-1 targets promoters with high CpG density, and these promoters are marked by high levels of H3K4me3. Furthermore, as for mammalian promoters, high CpG content is associated with nucleosome depletion irrespective of transcriptional activity. We further show that highly occupied target (HOT) regions identified by the binding of a large number of transcription factors are CpG-rich promoters in C. elegans and human genomes, suggesting that the unusually high factor association at HOT regions may be a consequence of CpG-linked chromatin accessibility. Our results indicate that nonmethylated CpG-dense sequence is a conserved genomic signal that promotes an open chromatin state, targeting by a CXXC1 ortholog, and H3K4me3 modification in both C. elegans and human genomes.

Figures

Figure 1.
HOT regions display promoter features. H3K4me3 and H3K4me1 signals plotted at the centers of core TF overlap regions ranked by the indicated percentile of TF occupancy in humans and C. elegans. (A) For human TF overlap regions, 7419 (top 1%), 30,945 (top 2%–5%), a random selection of 100,000 of 241,975 (top 6%–30%), and 456,812 (bottom 70%) regions are plotted. (B) For C. elegans TF overlap regions, 376 (top 1%), 1429 (top 2%–5%), 9721 (top 6%–30%), and 23,536 (bottom 70%) regions are plotted. In each TF overlap band, the number of TFs and the percentage of TF core midpoints ±500 bp of a TSS are indicated. Scales show linear (human) or log2 (C. elegans) input normalized signal ranges.
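The percentile bands in this caption (top 1%, top 2%–5%, top 6%–30%, bottom 70%) can be illustrated with a minimal sketch. It assumes each TF overlap region is summarized by a single TF count and that regions are ranked from highest to lowest occupancy; the function name `band_by_occupancy` and the tie handling are hypothetical and are not taken from the paper, so counts produced this way need not match the published band sizes.

```python
def band_by_occupancy(tf_counts):
    """Label each region with an occupancy band based on its rank.

    tf_counts: list of TF counts, one per region (assumed input format).
    Returns a list of band labels in the same order as the input.
    """
    n = len(tf_counts)
    # Rank regions from highest to lowest occupancy; sorted() is stable,
    # so tied regions keep their input order (an arbitrary choice).
    order = sorted(range(n), key=lambda i: -tf_counts[i])
    labels = [None] * n
    for rank, idx in enumerate(order):
        pos = rank + 1  # 1-based rank position
        # Integer comparisons avoid floating-point percentile rounding.
        if pos * 100 <= 1 * n:
            labels[idx] = "top 1%"
        elif pos * 100 <= 5 * n:
            labels[idx] = "top 2%-5%"
        elif pos * 100 <= 30 * n:
            labels[idx] = "top 6%-30%"
        else:
            labels[idx] = "bottom 70%"
    return labels
```

For 100 regions with distinct counts, this assigns 1, 4, 25, and 70 regions to the four bands, respectively.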
Figure 2.
C. elegans HOT regions are functional promoters. (A) The indicated HOT regions (orange) were cloned directly upstream of a histone∷GFP fusion gene and examined for the expression of GFP using transgenic reporter assays. (B) Ten of ten assayed regions drove GFP expression. The expression in representative larval stage 4 worms is shown. Coordinates of cloned regions and further information on expression patterns are shown in Supplemental Table S1.
Figure 3.
HOT regions are enriched for CpG dinucleotides and depleted for nucleosomes. (A) Frequency of the indicated mono- and dinucleotides in HOT (red) and COLD (blue) regions is shown relative to the genome-wide frequency scaled to one (black horizontal line). (B) Heat maps showing the distribution of CpG density in ranked TF overlap regions in human and C. elegans. The color scheme shows the scale (0 to 15) for CpG content in a 200-bp window. (C) The distribution of CpG density (top) and nucleosomes (bottom) was plotted for HOT (red) and COLD (blue) regions in the human and C. elegans genomes. Lines show mean signal, darker filled areas show standard error, and lighter filled areas are 95% confidence intervals. All plots show 2-kb regions centered at the midpoint of core regions.
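The two sequence measures named in this caption, dinucleotide frequency relative to the genome-wide frequency and CpG content in a 200-bp window, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: sequences are plain strings, only one strand is counted, and the helper names (`dinuc_frequency`, `relative_frequency`, `windowed_cpg_counts`) are hypothetical.

```python
def count_dinuc(seq, dinuc="CG"):
    """Count occurrences of a dinucleotide, allowing overlapping positions."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def dinuc_frequency(seq, dinuc="CG"):
    """Fraction of dinucleotide positions in seq that match dinuc."""
    positions = len(seq) - 1
    if positions < 1:
        return 0.0
    return count_dinuc(seq, dinuc) / positions


def relative_frequency(region_seq, genome_seq, dinuc="CG"):
    """Region frequency divided by genome-wide frequency.

    A value of 1.0 means the region matches the genome-wide frequency.
    """
    genome_freq = dinuc_frequency(genome_seq, dinuc)
    if genome_freq == 0:
        raise ValueError("dinucleotide absent from the reference sequence")
    return dinuc_frequency(region_seq, dinuc) / genome_freq


def windowed_cpg_counts(seq, window=200, step=1):
    """CpG count in each window of the given size along seq."""
    return [
        count_dinuc(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]
```

With the default `window=200`, each returned value is a CpG count per 200-bp window, the quantity the heat-map scale in panel B refers to.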
Figure 4.
A promoter harboring a CpG-dense region is associated with nucleosome depletion in C. elegans. (A) Plots of mononucleosome and CpG distributions across promoters of ubiquitously expressed genes in the top 20% of expression band, separated by high CpG density (red, top 20%) or low CpG density (dark gray, bottom 20%). (B) Mononucleosome and CpG distributions were analyzed for ubiquitously active promoters within 20% of CpG density (protein coding promoters) and separated into the top 20% (red) or second 20% (blue) of expression. (C) The distribution of mononucleosomes and CpGs across promoters with the top 20% CpG density separated into those with high (top 20%, pink), medium (middle 20%, blue), or low/no expression (bottom 40%, orange). Lines show mean signal, darker filled areas show standard error, and lighter filled areas are 95% confidence intervals.
Figure 5.
C. elegans CFP-1 is targeted to CpG-rich promoters marked by H3K4me3. (A) A representative screen shot for the distribution of CFP-1 (red) and H3K4me3 (blue) normalized ChIP signals in late embryos. (B) Venn diagram showing the overlap between CFP-1 and H3K4me3 ChIP-seq peaks. Numbers in the overlap region are not equal because single peaks in one data set may overlap more than one peak in the other data set. (C) Percentage of CFP-1 peaks overlapping promoter regions (orange; ±500 bp of coding TSSs) (Chen et al. 2013; Kruesi et al. 2013), gene bodies (green), and the remaining genomic regions (blue). (D) Signal distributions for CFP-1, H3K4me3, and CpG density plotted in 2-kb windows centered at coding TSSs identified in Chen et al. (2013) in heat maps ranked by the density of promoter CpGs. The color scheme shows the scale (0–15) for the indicated signal. (E) Plots of CFP-1 ChIP signal, CpG content, mononucleosome pattern, and H3K4me3 signal across promoters of active ubiquitous genes in the top 20% of expression with high (red, top 20%), or low (dark gray, bottom 20%) CFP-1 occupancy. (F) Plots of CFP-1 ChIP signal, CpG content, mononucleosome pattern, and H3K4me3 ChIP signal across ubiquitous promoters highly targeted by CFP-1 (top 20%) and with high (red, top 20%) or low (blue, second 20%) expression levels. All plots display the indicated features in 2-kb windows centered at TSSs. Lines show mean signal, darker filled areas show standard error, and lighter filled areas are 95% confidence intervals.
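Several captions describe plots in which lines show the mean signal, darker areas the standard error, and lighter areas the 95% confidence interval. The sketch below shows one way such a per-position summary could be computed. The captions do not say how the intervals were derived, so the normal approximation used here (mean ± 1.96 × standard error) is an assumption, and the function name `summarize_profiles` is hypothetical.

```python
import math
from statistics import mean, stdev


def summarize_profiles(profiles):
    """Summarize aligned signal profiles position by position.

    profiles: list of equal-length lists, one signal profile per region
              (for example, a 2-kb window centered at a TSS).
    Returns one (mean, standard_error, ci_low, ci_high) tuple per position.
    """
    summary = []
    for column in zip(*profiles):  # all regions' values at one position
        m = mean(column)
        # Standard error of the mean; undefined for one region, so use 0.
        se = stdev(column) / math.sqrt(len(column)) if len(column) > 1 else 0.0
        # Assumed normal-approximation 95% interval around the mean.
        summary.append((m, se, m - 1.96 * se, m + 1.96 * se))
    return summary
```

A plotting routine would then draw the first element of each tuple as the line and shade the ranges mean ± standard error and `ci_low` to `ci_high`.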
