Nat Commun. 2017 Sep 5;8(1):441.
doi: 10.1038/s41467-017-00524-5.

Topologically associating domains are ancient features that coincide with Metazoan clusters of extreme noncoding conservation

Nathan Harmston et al. Nat Commun.

Abstract

Developmental genes in metazoan genomes are surrounded by dense clusters of conserved noncoding elements (CNEs). CNEs exhibit unexplained extreme levels of sequence conservation, with many acting as developmental long-range enhancers. Clusters of CNEs define the span of regulatory inputs for many important developmental regulators and have been described previously as genomic regulatory blocks (GRBs). Their function and distribution around important regulatory genes raise the question of how they relate to the 3D conformation of these loci. Here, we show that clusters of CNEs strongly coincide with topological organisation, predicting the boundaries of hundreds of topologically associating domains (TADs) in human and Drosophila. The set of TADs associated with high levels of noncoding conservation exhibits distinct properties compared to TADs devoid of extreme noncoding conservation. The close correspondence between extreme noncoding conservation and TADs suggests that these TADs are ancient, revealing a regulatory architecture conserved over hundreds of millions of years.

Metazoan genomes contain many clusters of conserved noncoding elements. Here, the authors provide evidence that these clusters coincide with distinct topologically associating domains in humans and Drosophila, revealing a conserved regulatory genomic architecture.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Fig. 1
The boundaries of GRBs are highly consistent regardless of the thresholds or species involved. a The human MEIS1 locus is spanned by arrays of conserved noncoding elements (CNEs) identified in comparisons with opossum, chicken and spotted gar. These CNEs can be visualised as a smoothed density, shown here as a horizon plot. The boundaries of the proposed GRBs at the MEIS1 locus are highly consistent regardless of the species or thresholds involved. b All hg19-galGal4 GRBs centred and ordered by length of the GRB. c Distribution of CNEs in a window of 8 Mb around the centre of the hg19-galGal4 GRBs for different sets of CNEs. d Overlap of putative GRBs obtained from the hg19-monDom5 and hg19-lepOcu1 comparisons with the set of GRBs identified from hg19-galGal4 CNEs (70%/50 bp).
Fig. 2
The boundaries of GRBs predict the boundaries of TADs in multiple evolutionarily distant species. a Heatmaps representing the H1-ESC directionality index spanning an 8 Mb window around the centre of putative hg19-galGal4 GRBs. The panels show both the overall direction (middle panel, red for downstream, blue for upstream) and the average raw directionality score in 5 kb bins (right panel). b Heatmaps of the Drosophila embryo Hi-C directionality index spanning a 2 Mb window around the centre of dm3-droMoj3 GRBs. The panels show both the overall direction (middle panel, red for downstream, blue for upstream) and the average raw directionality score in 1 kb bins (right panel). c A large number of GRBs were found to be located within individual TADs (identified using HOMER) or to overlap only a single TAD, regardless of cell lineage (H1-ESC (H1), mesenchymal stem cells (MS), mesendoderm (ME), neural progenitor cells (NP) and trophoblast-like (TB)). d Cumulative distribution of the distance from GRB boundaries to the nearest TAD (HOMER) boundaries in different cell lineages, considering both edges, i.e., both the start and end positions of a GRB lie within X kb of the nearest TAD start and end.
Fig. 3
Examples of genomic regulatory blocks and their associated interaction landscapes in human. GRBs at several human loci show strong association with the structure of regulatory domains proposed from Hi-C. a The GRB containing MEIS1 (chr2:65270920-68723490) accurately predicts the span of regulatory interactions defined by Hi-C. b The region located at chr6:44198640-46071520 contains both the transcription factor RUNX2 and its bystander gene SUPT3H (shown in brown), both of which are located within a GRB that predicts the topological organisation of the locus. c A region located at chr16:48476700-55776880 in hg19 contains several GRBs containing important developmental regulators, including IRX3/5/6, TOX3, SALL1, NKD1 and ZNF423, which exhibit strong concordance with TADs. The IRX3/5/6 locus encodes homeobox proteins that have multiple functions during animal development, and it contains a well-known bystander gene, FTO (shown in brown), which harbours an intronic enhancer that drives expression of IRX3.
Fig. 4
Several sets of features distinguish TADs associated with extreme conservation from those without. a Depletion of SINE elements within GRB-TADs compared to non-GRB-TADs (using H1 HOMER TADs) reflects the selective constraint on these regions against the retention of repeat element insertions. b GRBs in Drosophila Kc167 cells are mainly associated with inactive (black) and Polycomb-repressed chromatin (blue) and represent functionally coherent regions. Active and regulated chromatin correspond to different types of euchromatin. There appears to be a change in the proportion of constitutively active chromatin (yellow) at the boundary regions identified as GRBs. c CTCF sites are depleted within GRBs. CTCF sites per 10 kb are plotted across GRBs and flanking regions of equivalent size to the GRBs, normalised to show signal relative to GRB boundaries and Loess-smoothed. d Enrichment for different patterns of CTCF binding at different genomic features (Specific, Intermediate, Constitutive). Constitutive CTCF peaks are enriched within 10 kb of both GRBs and GRB-TAD boundaries (binomial test p-values: *p < 0.05, **p < 0.01, ***p < 0.001). e Distribution of TAD width reveals that GRB-TADs are significantly longer than non-GRB-TADs identified in human H1 cells using either HOMER (median width 620 vs. 460 kb, p < 1e−6) or HMM_calls (median width 920 vs. 680 kb, p < 1e−6). f Human H1 GRB-TADs are associated with lower protein-coding gene density than non-GRB-TADs identified using HOMER (median no. of genes 2.63 vs. 8.33, p < 1e−6) or HMM_calls (median no. of genes 2.65 vs. 8.33, p < 1e−6). g GRB-TADs are significantly stronger than non-GRB-TADs identified in human H1 cells using HOMER (median strength 90.88 vs. 75.81, p < 1e−6) or HMM_calls (median strength 91.77 vs. 77.68, p < 1e−6). h GRB-TADs (HOMER) are preferentially associated with compartment B in all of the lineages investigated. i GRB-TADs are more likely to switch compartment in at least one of the five lineages investigated (i.e., A–B or B–A) than non-GRB-TADs.
Fig. 5
The span of CNEs in various species is predictive of genome size and TAD size. a Distribution of genome size, TAD size and GRB size in opossum, humans, mouse, chicken, spotted gar and Drosophila. b Sizes of 17 stringent GRBs identified in various species show direct expansion and growth in line with that expected given differences in genome size. c Correlation of the sizes of GRBs identified in one species with those in other species and with the corresponding TADs in humans shows that the size of a GRB identified in one species is highly predictive of the size and scope of regulatory domains in other species.
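
The boundary comparison described for Fig. 2d (both the start and end of a GRB lying within X kb of the nearest TAD start and end) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy coordinates, and the choice to compare GRB starts against TAD starts and GRB ends against TAD ends on a single chromosome are all assumptions made for illustration.

```python
from bisect import bisect_left


def nearest_distance(sorted_positions, x):
    """Distance from x to the closest value in a sorted, non-empty list."""
    i = bisect_left(sorted_positions, x)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_positions[max(i - 1, 0):i + 1]
    return min(abs(x - p) for p in candidates)


def edge_distances(grbs, tads):
    """For each GRB (start, end), return the larger of its two edge distances.

    Assumption: the GRB start is compared with the nearest TAD start and the
    GRB end with the nearest TAD end, all on one chromosome, in base pairs.
    """
    tad_starts = sorted(start for start, _ in tads)
    tad_ends = sorted(end for _, end in tads)
    return [
        max(nearest_distance(tad_starts, start), nearest_distance(tad_ends, end))
        for start, end in grbs
    ]


def cumulative_fraction(distances, thresholds):
    """Fraction of GRBs whose two edges both lie within each threshold."""
    n = len(distances)
    return [sum(d <= t for d in distances) / n for t in thresholds]


if __name__ == "__main__":
    # Hypothetical toy intervals, not data from the paper.
    toy_tads = [(0, 1000), (1000, 2500), (2500, 4000)]
    toy_grbs = [(50, 980), (1100, 2400), (2000, 3000)]
    dists = edge_distances(toy_grbs, toy_tads)
    print(dists)
    print(cumulative_fraction(dists, [50, 100, 1000]))
```

Taking the larger of the two edge distances per GRB means a GRB counts at threshold X only when both of its edges are within X of a TAD boundary, which is one reading of "considering both edges" in the caption.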

