Genome Biol. 2024 Jun 13;25(1):156.
doi: 10.1186/s13059-024-03300-z.

CpG island turnover events predict evolutionary changes in enhancer activity

Acadia A Kocher et al. Genome Biol.

Abstract

Background: Genetic changes that modify the function of transcriptional enhancers have been linked to the evolution of biological diversity across species. Multiple studies have focused on the role of nucleotide substitutions, transposition, and insertions and deletions in altering enhancer function. CpG islands (CGIs) have recently been shown to influence enhancer activity, and here we test how their turnover across species contributes to enhancer evolution.

Results: We integrate maps of CGIs and enhancer activity-associated histone modifications obtained from multiple tissues in nine mammalian species and find that CGI content in enhancers is strongly associated with increased histone modification levels. CGIs show widespread turnover across species, and species-specific CGIs are strongly enriched for enhancers exhibiting species-specific activity across all tissues and species. Genes associated with enhancers with species-specific CGIs show concordant biases in their expression, supporting the view that CGI turnover contributes to gene regulatory innovation. Our results also implicate CGI turnover in the evolution of Human Gain Enhancers (HGEs), which show increased activity in human embryonic development and may have contributed to the evolution of uniquely human traits. Using a humanized mouse model, we show that a highly conserved HGE with a large CGI absent from the mouse ortholog shows increased activity at the human CGI in the humanized mouse diencephalon.

Conclusions: Collectively, our results point to CGI turnover as a mechanism driving gene regulatory changes potentially underlying trait evolution in mammals.

Keywords: Comparative genomics; Gene regulation; Orphan CpG islands; Transcriptional enhancer evolution.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
oCGIs are enriched for enhancer-associated histone modifications. A The number of oCGIs identified in nine mammalian genomes considered in this study. B Percent of oCGIs overlapping a histone modification peak for each indicated histone modification and tissue in rhesus macaque [31, 34, 36]. B = adult brain, L = adult liver, M = adult muscle, T = adult testis, DC = developing cortex, DL = developing limb. Gray horizontal lines indicate the expected overlap and stars indicate significant enrichment (q < 0.05, BH-corrected, determined by permutation test; see “Methods”). C The level of each indicated histone modification in peaks with and without oCGIs, measured in reads per kilobase per million (RPKM). Box plots show the interquartile range and median, and whiskers indicate the 90% confidence interval. Stars indicate a significant difference between peaks with and without an oCGI (q < 0.05, Wilcoxon rank-sum test, BH-corrected). D Maximum phastCons LOD (log-odds) scores in peaks with and without oCGIs. Box plots show the interquartile range and median, and whiskers indicate the 90% confidence interval. Stars indicate a significant difference between peaks with and without an oCGI (q < 0.05, Wilcoxon rank-sum test, BH-corrected). E Evolutionary origin of peaks with and without oCGIs. Bar plots show the percentage of peaks with and without oCGIs whose oldest sequence belongs to each age category. The results shown in panels (C) through (E) were generated using peaks from adult brain in rhesus macaque; see Additional file 1: Figs. S8, S10, and S12 for results from additional species and tissues
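The RPKM unit used in panel C can be illustrated with a short calculation. This is a minimal sketch of the standard reads-per-kilobase-per-million definition, not the authors' pipeline; the function name and the example numbers are hypothetical.

```python
def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and library size must be positive")
    kilobases = region_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Hypothetical peak: 500 reads in a 2,000-bp peak from a 20-million-read library.
print(rpkm(500, 2_000, 20_000_000))  # 12.5
```

Dividing by both peak length and library size is what makes signal levels comparable between peaks of different widths and between samples sequenced to different depths.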
Fig. 2
oCGIs show extensive turnover across species. A Schematic illustrating how we defined species-specific oCGIs in pairwise comparisons, using rhesus macaque and mouse as an example. Left: a rhesus-only oCGI (the sequence is present in both rhesus and mouse, but the oCGI is only present in rhesus). Right: a shared oCGI (both the sequence and oCGI are present in both rhesus and mouse). Ticks under each oCGI represent the locations of CpG dinucleotides. B Percent of oCGIs across the indicated species pairs (species A versus species B) that are “A-only,” “B-only,” or “shared” as described in the main text. The species pair is shown under each bar, with species A denoted by a white circle and species B denoted by a black circle. Percentages of oCGIs that are species A-only (white), species B-only (black), or shared (gray) are shown. C Number of CpG dinucleotides in rhesus-only (dark blue) or mouse-only (light blue) oCGIs compared to shared (gray) oCGIs. Box plots show the interquartile range and median, and whiskers indicate the 90% confidence interval. Stars indicate significant differences (q < 0.05, Wilcoxon rank-sum test, BH-corrected). D Maximum phastCons LOD scores in rhesus-only, mouse-only, and shared oCGIs. Box plots show the interquartile range and median, and whiskers indicate the 90% confidence interval. Stars indicate significant differences (q < 0.05, Wilcoxon rank-sum test, BH-corrected). E Evolutionary origins of rhesus-only, mouse-only, and shared oCGI sequences
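The pairwise labels in panels A and B, and the CpG tick marks, can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and it assumes the orthologous sequence is already known to be present in both species, as the caption requires. It does not reproduce the criteria the authors used to call oCGIs.

```python
def cpg_positions(seq: str) -> list[int]:
    """Return 0-based start positions of CpG dinucleotides (a C followed by a G)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def classify_ocgi(ocgi_in_a: bool, ocgi_in_b: bool) -> str:
    """Label an orthologous locus by which species carries an oCGI there.

    Assumes the underlying sequence aligns between species A and species B.
    """
    if ocgi_in_a and ocgi_in_b:
        return "shared"
    if ocgi_in_a:
        return "A-only"
    if ocgi_in_b:
        return "B-only"
    raise ValueError("locus has no oCGI in either species")


print(cpg_positions("ACGTCGCGA"))   # [1, 4, 6]
print(classify_ocgi(True, False))   # A-only
```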
Fig. 3
Species-specific oCGIs are significantly enriched for species-specific histone modification peaks. A Schematic illustrating how we defined species-specific and shared oCGIs and peaks. In each pairwise species comparison for each histone modification and tissue, we sorted oCGIs based on their species specificity (designated as A-only, B-only, or shared as in Fig. 2) and the species specificity of their histone modification peaks (shown in orange in the schematic). B An example of a rhesus macaque-specific oCGI overlapping a rhesus-specific H3K4me3 peak in a pairwise comparison with mouse. Ticks show the location of CpG dinucleotides. Normalized H3K4me3 signal at this locus (orange) is shown as read counts per million in adjacent 10-bp bins. C Enrichment and depletion in each indicated comparison of species-specific and shared oCGIs (top: A-only, B-only, Shared) and species-specific and shared peaks (left: A-only, B-only, Shared), compared to a null expectation of no association between oCGI turnover and peak turnover. Each 3 × 3 grid shows the results for a specific test examining oCGIs and their overlap with three histone modifications in adult rhesus macaque brain. Each box in each grid is colored according to the level of enrichment over expectation (orange for H3K4me3, green for H3K27ac, or purple for H3K4me1) or depletion (gray for all marks) of genome-wide sites that meet the criteria for that box. The color bar below each plot illustrates the level of enrichment or depletion over expectation. Filled upward-pointing triangles denote significant enrichment and open downward-pointing triangles denote significant depletion (q < 0.05, permutation test, BH-corrected, see Additional file 1: Fig. S22 and “Methods”). D Enrichment and depletion in an additional species comparison, rat versus dog, and in additional tissues (liver, top, and muscle, bottom), shown as described in (C). E Maximum LOD score in species-specific oCGIs in species-specific peaks and shared oCGIs in shared peaks, using data from adult rhesus macaque brain. Box plots show the interquartile range and median, and whiskers indicate the 90% confidence interval. Stars indicate significance (q < 0.05, Wilcoxon rank-sum test, BH-corrected)
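The 3 × 3 enrichment grids in panels C and D compare observed counts against a null of no association between oCGI turnover and peak turnover. One way such a grid might be computed is sketched below. It assumes the null expectation for each box is the product of the row and column totals divided by the number of sites, and it uses a simple label-shuffling permutation test. Both choices, and all names here, are assumptions for illustration; the authors' actual procedure is described in their Methods and Additional file 1: Fig. S22.

```python
import random
from collections import Counter

CATEGORIES = ("A-only", "B-only", "shared")


def enrichment_grid(ocgi_labels, peak_labels):
    """Observed/expected ratio per (peak, oCGI) box, assuming independence as the null."""
    n = len(ocgi_labels)
    observed = Counter(zip(peak_labels, ocgi_labels))
    peak_totals = Counter(peak_labels)
    ocgi_totals = Counter(ocgi_labels)
    grid = {}
    for peak in CATEGORIES:
        for ocgi in CATEGORIES:
            expected = peak_totals[peak] * ocgi_totals[ocgi] / n
            grid[(peak, ocgi)] = observed[(peak, ocgi)] / expected if expected else float("nan")
    return grid


def permutation_p(ocgi_labels, peak_labels, box, n_perm=1000, seed=0):
    """One-sided empirical p-value for enrichment in one box, shuffling peak labels."""
    rng = random.Random(seed)
    observed = sum((p, o) == box for p, o in zip(peak_labels, ocgi_labels))
    shuffled = list(peak_labels)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        count = sum((p, o) == box for p, o in zip(shuffled, ocgi_labels))
        if count >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Toy input with perfect agreement between oCGI and peak species specificity.
ocgis = ["A-only"] * 4 + ["B-only"] * 4 + ["shared"] * 4
peaks = list(ocgis)
grid = enrichment_grid(ocgis, peaks)
print(round(grid[("A-only", "A-only")], 3))  # 3.0
print(permutation_p(ocgis, peaks, ("A-only", "A-only")))
```

Ratios above 1 correspond to the colored (enriched) boxes and ratios below 1 to the gray (depleted) boxes. The p-values from all boxes would then be corrected for multiple testing before stars or triangles are drawn.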
Fig. 4
Association of species-specific oCGIs with species-specific histone modification peaks and HGEs in the developing human cortex and limb. A Enrichment and depletion in each indicated comparison of species-specific and shared oCGIs (top: A-only, B-only, Shared) and species-specific and shared peaks (left: A-only, B-only, Shared), compared to a null expectation of no association between oCGI turnover and peak turnover. Results are shown as in Fig. 3C,D, with enrichment in green for H3K27ac and yellow for H3K4me2, and depletion in gray. One representative comparison is shown for developing cortex (8.5 post-conception weeks (p.c.w.) in human versus embryonic day 14.5 in mouse) and developing limb (embryonic day 41 in human versus embryonic day 12.5 in mouse). B Enrichment of specific oCGI species patterns in HGEs compared to non-HGE enhancers in human cortex at 8.5 p.c.w. Bar plots show the percentage of HGEs (left bar) or non-HGE enhancers (right bar) that overlap an oCGI with the species pattern shown on the left. Significance was determined using a resampling test comparing HGEs to non-HGE human enhancers matched for overall histone modification levels (resampling test, BH-corrected; see Additional file 1: Fig. S34 and “Methods”). C H3K27ac levels in developing diencephalon at the humanized hs754 (top tracks) or wild type (bottom tracks) mouse locus at E11.5. Locations of oCGIs within hs754 and its mouse ortholog are shown at the top (dark gray boxes for two human oCGIs not present in the mouse sequence, and a light gray box for a mouse oCGI not present in the human sequence). Dark green (humanized) and light green (wild type) signal tracks show normalized H3K27ac levels as counts per million reads calculated in adjacent 10-bp bins. Dark orange (humanized) and light orange (wild type) signal tracks show normalized H3K4me3 levels. Peak calls are shown as boxes below the signal tracks. Nominal p-values were obtained by DESeq2 using a Wald test, then BH-corrected for multiple testing across all peaks genome-wide to generate q-values (see values in main text and in Additional file 1: Fig. S38)
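The BH correction that turns nominal p-values into q-values throughout these captions follows the Benjamini-Hochberg step-up procedure. A minimal pure-Python sketch is shown here for illustration; the function name and the example p-values are hypothetical, and in practice the adjusted values reported by DESeq2 or a statistics library would be used directly.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical nominal p-values for four peaks.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(v, 4) for v in q])  # [0.04, 0.0533, 0.0533, 0.2]
```

A peak is then called significant when its q-value falls below the chosen threshold, which is 0.05 in the captions above.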
Fig. 5
Species-specific oCGIs in species-specific peaks are associated with gene expression changes. A Schematic illustrating our method for assigning oCGIs and peaks to genes as described in the text and Additional file 1: Fig. S42, using a pairwise comparison of rat and pig as an example. Left: A gene associated with a rat-only oCGI in a rat-only H3K27ac peak, which means the gene is assigned to the “rat-only set” (A-only set) of genes. Right: A gene associated with a pig-only oCGI in a pig-only H3K27ac peak, which means the gene is assigned to the “pig-only set” (B-only set) of genes. B The log2-transformed TPM ratio for genes in the A-only set and the B-only set for each indicated species pair and histone modification using data from adult brain. Points indicate median values for the A-only set (dark blue) and the B-only set (light blue) and lines indicate the interquartile range. All values in the A-only set and B-only set were normalized to the median TPM ratio across resampling rounds from the background set. Stars indicate a significant difference between the observed median and the expected median (q < 0.05, resampling test to compare to the background set, BH-corrected; see Additional file 1: Fig. S42 and “Methods”)
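The normalization in panel B can be sketched as follows. This is a rough illustration under stated assumptions: the ratio is taken as species A over species B, a pseudocount of 1 is added to avoid division by zero, and the expected median comes from repeatedly drawing size-matched gene sets from the background. None of these details are given in the caption, and the function names are hypothetical.

```python
import math
import random
import statistics


def log2_tpm_ratio(tpm_a: float, tpm_b: float, pseudocount: float = 1.0) -> float:
    """log2 of the species A / species B expression ratio (pseudocount is an assumption)."""
    return math.log2((tpm_a + pseudocount) / (tpm_b + pseudocount))


def normalized_median(set_ratios, background_ratios, n_rounds=1000, seed=0):
    """Observed median log2 ratio minus the median expected from background resampling."""
    rng = random.Random(seed)
    k = len(set_ratios)
    resampled_medians = [
        statistics.median(rng.sample(background_ratios, k)) for _ in range(n_rounds)
    ]
    expected = statistics.median(resampled_medians)
    return statistics.median(set_ratios) - expected


# Hypothetical values: an A-only gene set shifted upward relative to a flat background.
a_only_set = [1.0, 2.0, 3.0]
background = [0.0] * 10
print(log2_tpm_ratio(7.0, 1.0))                   # 2.0
print(normalized_median(a_only_set, background))  # 2.0
```

Under this convention, a positive normalized median for the A-only set and a negative one for the B-only set would correspond to the concordant expression bias described in the abstract.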
Fig. 6
oCGI turnover is associated with changes in transcription factor binding. A Schematic illustrating how we compared species-specific oCGIs with species-specific transcription factor binding events in adult liver, using rhesus macaque and mouse as an example case. Left: a rhesus-only (species A-only) oCGI with a rhesus-only (species A-only) CTCF peak. Right: a shared oCGI with a shared CTCF peak. Ticks show the locations of CpG dinucleotides. B Left: the consensus motif for CTCF (MA1929.1 from the JASPAR database). Right: Enrichment and depletion in each indicated comparison of species-specific and shared oCGIs (top: A-only, B-only, Shared) and species-specific and shared CTCF peaks (left: A-only, B-only, Shared), compared to a null expectation of no association between oCGI turnover and peak turnover. Each 3 × 3 grid shows the results for a specific test examining oCGIs and their overlap with CTCF peaks. Each box in each grid is colored according to the level of enrichment over expectation (teal) or depletion (gray) of genome-wide sites that meet the criteria for that box. The color bar below each plot illustrates the level of enrichment or depletion over expectation. The filled upward-pointing triangles denote significant enrichment and open downward-pointing triangles denote significant depletion (q < 0.05, permutation test, BH-corrected; see Additional file 1: Fig. S22 and “Methods”). C Left: the consensus motif for FOXA1 (MA0148.1 from the JASPAR database). Right: Enrichment and depletion in each indicated comparison of species-specific and shared oCGIs (top: A-only, B-only, Shared) and species-specific and shared FOXA1 peaks (left: A-only, B-only, Shared). Shown as in (B) but with boxes colored according to the level of enrichment over expectation (red) and depletion (gray) of genome-wide sites that meet the criteria for that box
Fig. 7
Model of enhancer evolution via oCGI turnover. A Evolution of a new enhancer from a locus in a closed chromatin state. This locus may include unconstrained, inaccessible TFBSs (striped boxes on DNA). DNA is depicted as a black line wrapped around cylindrical nucleosomes. After oCGI gain by several potential mechanisms (indicated in the figure), the site now acts as a proto-enhancer located within open, active chromatin recruited by the oCGI [42], which allows TFs to bind previously inaccessible TFBSs. A subset of histone tails (curved gray lines) with H3K4me3 (orange hexagons) and H3K27ac (green stars) modifications are shown. Filled lollipops indicate methylated CpGs, and unfilled lollipops indicate unmethylated CpGs. Over time, TFBSs become constrained (filled boxes on DNA) and additional TFBSs may arise and become fixed, resulting in the evolution of an enhancer with a constrained biological function. B Co-option of an existing enhancer in a novel biological context via oCGI gain. In an ancestral species, the enhancer is active in the developing limb and inactive in the developing brain, where the chromatin at the locus is closed. After oCGI gain, CpG-related mechanisms generate open chromatin in the developing brain, which allows existing unconstrained brain TFBSs to be bound. Over time, these and additional TFBSs may gain biological functions and be maintained by selection. The locus becomes a functional enhancer in the developing brain

References

    1. Reilly SK, Noonan JP. Evolution of Gene Regulation in Humans. Annu Rev Genom Hum G. 2016;17(1):45–67. doi: 10.1146/annurev-genom-090314-045935.
    2. Whalen S, Pollard KS. Enhancer Function and Evolutionary Roles of Human Accelerated Regions. Annu Rev Genet. 2022;56(1):423–439. doi: 10.1146/annurev-genet-071819-103933.
    3. Dutrow EV, Emera D, Yim K, Uebbing S, Kocher AA, Krenzer M, et al. Modeling uniquely human gene regulatory function via targeted humanization of the mouse genome. Nat Commun. 2022;13(1):304. doi: 10.1038/s41467-021-27899-w.
    4. Aldea D, Atsuta Y, Kokalari B, Schaffner SF, Prasasya RD, Aharoni A, et al. Repeated mutation of a developmental enhancer contributed to human thermoregulatory evolution. Proc National Acad Sci. 2021;118(16):e2021722118. doi: 10.1073/pnas.2021722118.
    5. Boyd JL, Skove SL, Rouanet JP, Pilaz LJ, Bepler T, Gordân R, et al. Human-Chimpanzee Differences in a FZD8 Enhancer Alter Cell-Cycle Dynamics in the Developing Neocortex. Curr Biol. 2015;25(6):772–779. doi: 10.1016/j.cub.2015.01.041.
