PLoS Comput Biol. 2026 Jan 20;22(1):e1013887.
doi: 10.1371/journal.pcbi.1013887. eCollection 2026 Jan.

A wavelet-based approach generates quantitative, scale-free and hierarchical descriptions of 3D genome structures and new biological insights

Ryan Pellow et al. PLoS Comput Biol.

Abstract

Eukaryotic genomes are organized within nuclei in three-dimensional space, forming structures such as loops, topologically associating domains (TADs), and chromosome territories. This 3D architecture impacts gene regulation and development, stress responses, and disease. However, current methods to infer these 3D structures from genomic data have multiple drawbacks, including varying outcomes depending on the resolution of the analysis and sequencing depth, qualitative outputs that limit statistical comparisons, and insufficient insight into structure frequency within samples. These challenges hinder rigorous comparisons of 3D properties across genomes, conditions, or species. To overcome these issues, we developed WaveTAD, a wavelet transform-based method that provides a resolution-free, probabilistic, and hierarchical description of 3D organization. WaveTAD generates TAD strengths, capturing the variable frequency of intrachromosomal interactions within samples, and shows increased accuracy and sensitivity over existing methods. We applied WaveTAD to multiple datasets from Drosophila, mouse, and humans to illustrate new biological insights that our more sensitive and quantitative approach provides, such as the widespread presence of embryonic 3D organization before zygotic genome activation, the effect of multiple CTCF units on the stability of loops and TADs, and the association between gene expression and TAD structures in COVID-19 patients or sex-specific transcription in Drosophila.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Overview of WaveTAD method.
Fig 2. Walkthrough of an application of WaveTAD.
(A) Contact matrices (1kb resolution) highlighting the loss of the loop structure when the kni tethering element is perturbed, from Levo et al. [36]. (B) Underlying coverage signal of the Hi-C matrix for the 5’ (top) and 3’ (bottom) contacts for the wildtype (left) and mutant (right) samples. (C) Detail coefficients (shown at two specified scales) generated by applying wavelet transforms to the coverage signal. (D) p-values corresponding to the generated detail coefficients. (E) Representative diagram of the “donut” algorithm used to identify both loops and TAD apexes. (F) Schematic of the diamond area algorithm overlaid on the control knrl-kni contact matrix. The diamond area algorithm is used to reconcile cases in which WaveTAD calls the same TAD boundary at multiple scales at slightly shifted locations (red box). (G) Boxplot comparing the change in boundary and loop p-values between the wildtype and perturbed samples. See Methods for details.
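As a rough illustration of panel (C), the sketch below computes Haar-style detail coefficients of a one-dimensional coverage signal at a chosen scale. The Haar wavelet, the function name `haar_detail`, and the toy coverage values are assumptions made for illustration only; the caption does not state which wavelet family or implementation WaveTAD uses.

```python
from typing import List


def haar_detail(signal: List[float], scale: int) -> List[float]:
    """Undecimated Haar-style detail coefficient at each position.

    d[i] = mean(signal[i:i+scale]) - mean(signal[i-scale:i]),
    so an abrupt rise in coverage at position i yields a large positive value.
    Positions without a full window on both sides are left at 0.0.
    """
    n = len(signal)
    out = [0.0] * n
    for i in range(scale, n - scale + 1):
        left = sum(signal[i - scale:i]) / scale
        right = sum(signal[i:i + scale]) / scale
        out[i] = right - left
    return out


# Toy coverage signal (hypothetical): a step from 1 to 5 at index 8.
coverage = [1.0] * 8 + [5.0] * 8
d2 = haar_detail(coverage, 2)  # fine scale
d4 = haar_detail(coverage, 4)  # coarser scale
peak2 = max(range(len(d2)), key=lambda i: abs(d2[i]))
peak4 = max(range(len(d4)), key=lambda i: abs(d4[i]))
print(peak2, peak4)
```

In this toy case both scales peak at the same position, which is the kind of agreement across scales that a multi-scale boundary caller can exploit.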
Fig 3. Comparison of TAD calls between methods.
Box plots of TAD sizes called by each tool using different resolutions of contact matrices for the different species: fly (1kb, 5kb, 10kb, 25kb, 50kb), mouse (5kb, 10kb, 25kb, 50kb), and human (10kb, 25kb, 50kb). Note that WaveTAD is resolution-free. Panels (A) and (B) show non-hierarchical and hierarchical TAD callers, respectively.
Fig 4. Comparison of TAD calls between methods.
Concordance of TAD boundaries, measured with the Jaccard Index, across read depths (millions of mapped Hi-C contacts per megabase (Mc/Mb) in parentheses). The Jaccard Index was estimated relative to the highest contact read depth for each species: 75M (0.35 Mc/Mb) for fly, 250M (0.10 Mc/Mb) for mouse, and 1B (0.31 Mc/Mb) for human. A resolution of 10kb was used for fly and 25kb for mouse and human. Panels (A) and (B) show non-hierarchical and hierarchical TAD callers, respectively.
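The Jaccard Index used here is the size of the intersection of two boundary sets divided by the size of their union. The sketch below is a minimal version that assumes boundaries match when they fall in the same fixed-size bin; the matching rule, the `bin_size` default, and the toy coordinates are assumptions, not the paper's exact procedure.

```python
from typing import Iterable


def jaccard_index(boundaries_a: Iterable[int],
                  boundaries_b: Iterable[int],
                  bin_size: int = 10_000) -> float:
    """Jaccard Index of two sets of boundary positions (bp), compared per bin."""
    bins_a = {pos // bin_size for pos in boundaries_a}
    bins_b = {pos // bin_size for pos in boundaries_b}
    union = bins_a | bins_b
    if not union:
        return 0.0  # undefined for two empty sets; 0.0 is a convention chosen here
    return len(bins_a & bins_b) / len(union)


# Hypothetical boundary calls at full depth and after downsampling.
full_depth = [100_000, 250_000, 400_000, 620_000]
downsampled = [100_000, 255_000, 400_000]
score = jaccard_index(full_depth, downsampled)
print(score)  # 3 shared bins out of 4 in the union
```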
Fig 5. Concordance of TADs called by various tools between biological replicates in Drosophila melanogaster.
Venn diagrams depicting the number of TADs called in each biological replicate and the number of overlapping TADs between replicates. For each tool, the Jaccard Index across all replicates is shown below the Venn diagram. Data are from Hug et al. [37]: staged embryos 3-4 hours post fertilization, biological replicates 1-3 (yellow, red, and blue, respectively).
Fig 6. WaveTAD identifies TADs and boundary changes in heterogeneous samples.
(A) Contact matrices (10kb resolution) containing Hi-C contacts mixed between two Drosophila melanogaster cell lines. Shown is a region (3R:22,000,000-23,400,000) with two TADs (white arrows) unique to the S2 cell line (S2 TAD Left and S2 TAD Right). The pie chart above each contact matrix shows the ratio of reads used to build the matrix (1:0, 3:1, 1:1, 1:3, or 0:1 ratio of KC167 to S2, respectively). Below, lines indicate whether WaveTAD called the pair of TADs (green) or not (red). (B) Bar plots showing the TAD strength (-Log10(p)) of each of the two TADs highlighted in (A), across the mixed Hi-C contact samples. (C) Genome-wide analysis of all TADs unique to the S2 cell line. Bar plot shows median TAD strengths (-Log10(p)) for the S2-unique TADs given the ratio of Hi-C contacts in the mixed samples (pie charts on the x-axis). For all analyses, contact matrices contained a total of 75 million Hi-C contacts.
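TAD strength is reported as -Log10(p), so smaller p-values map to larger strengths, and panel (C) summarizes each mixture by its median. The sketch below shows that transformation; the function name `tad_strength` and the p-values are made up purely to exercise the code and are not results from the paper.

```python
import math
from statistics import median
from typing import Dict, List


def tad_strength(p_value: float) -> float:
    """Convert a p-value into a strength score, -log10(p)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)


# Made-up p-values keyed by a hypothetical KC167:S2 read ratio.
toy_p_values: Dict[str, List[float]] = {
    "0:1": [1e-8, 1e-6, 1e-4],
    "1:1": [1e-4, 1e-3, 1e-2],
    "1:0": [0.5, 0.1, 1.0],
}

# Median strength per mixture, as summarized in a bar plot like panel (C).
median_strength = {
    ratio: median(tad_strength(p) for p in p_values)
    for ratio, p_values in toy_p_values.items()
}
print(median_strength)
```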
Fig 7. Compartment sized TADs called by WaveTAD match high-resolution imaging.
(A,B) Percent of FISH-confirmed TADs called by different tools at 25kb and 50kb resolutions (WaveTAD is independent of resolution) (see Methods for details). (C) Polarization index based on the compartments confirmed by WaveTAD and a randomized control (Wilcoxon Paired Test). Analysis for chromosome 20 from FISH performed on 120 individual cells, where the red line represents the median and the blue boxes represent the 1st and 3rd quartiles. (D) Spatial map of the convex hulls for compartment-A (red) and compartment-B TADs (blue). Each TAD position (point) represents the centroid of the 120 individual cells. The spatial map was orientated by rotating the vector connecting the centroid of the convex hulls (A and B compartments) so that it was aligned to the y-axis. (E) An example of FISH-validated long-range contacts (blue and purple triangles) presented alongside the corresponding WaveTAD-identified TAD (blue line).
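The orientation step in panel (D) can be sketched as a plain rotation. The example below works in two dimensions, uses the mean of each point set as a stand-in for the convex-hull centroid, and places the A-to-B vector along the positive y-axis; all three choices, along with the function names and coordinates, are simplifying assumptions rather than the authors' procedure.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def centroid(points: List[Point]) -> Point:
    """Mean position of a set of 2D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def align_to_y_axis(a_points: List[Point],
                    b_points: List[Point]) -> Tuple[List[Point], List[Point]]:
    """Rotate both point sets about A's centroid so that the vector from
    A's centroid to B's centroid points along +y."""
    ca, cb = centroid(a_points), centroid(b_points)
    vx, vy = cb[0] - ca[0], cb[1] - ca[1]
    # Angle that carries the direction (vx, vy) onto the +y direction.
    theta = math.pi / 2 - math.atan2(vy, vx)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    def rotate(p: Point) -> Point:
        x, y = p[0] - ca[0], p[1] - ca[1]
        return (x * cos_t - y * sin_t, x * sin_t + y * cos_t)

    return [rotate(p) for p in a_points], [rotate(p) for p in b_points]


# Hypothetical TAD centroids for two compartments.
comp_a = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
comp_b = [(4.0, 3.0), (6.0, 3.0), (5.0, 4.0)]
rot_a, rot_b = align_to_y_axis(comp_a, comp_b)
new_ca, new_cb = centroid(rot_a), centroid(rot_b)
print(new_ca, new_cb)
```

After the rotation, A's centroid sits at the origin and B's centroid lies directly above it, at a distance equal to the original separation between the two centroids.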
Fig 8. A majority of higher-order chromatin structures appear before zygotic genome activation (ZGA) in both fly and mouse.
(A) Contact matrices (10kb resolution) of Drosophila melanogaster (chr3R:12,400,000-13,600,000) over four developmental timepoints (nc12, nc13, nc14, and 3-4 hours post fertilization, hpf) and the KC167 cell line are overlaid with TADs called by WaveTAD. The red lines and corresponding percentages (in red) show the genome-wide concordance of TAD boundaries for each timepoint (nc12, nc13 and nc14) relative to the 3-4 hpf timepoint using the Jaccard Index. The dark blue lines and corresponding percentages show the genome-wide concordance of TAD boundaries using the Jaccard Index for each timepoint (nc12, nc13, nc14, and 3-4 hpf, respectively) relative to the KC167 cell line. Note that zygotic genome activation occurs at nc14. (B) Contact matrices (10kb resolution) of Mus musculus (chr6:51,000,000-53,000,000) over five developmental timepoints (PN5, early 2-cell, late 2-cell, 8-cell, and ICM) and the mESC cell line are overlaid with TADs called by WaveTAD. The red lines and corresponding percentages show the genome-wide concordance of TAD boundaries using the Jaccard Index for each timepoint (PN5, early 2-cell, late 2-cell, and 8-cell, respectively) relative to the ICM timepoint. The dark blue lines and corresponding percentages show the concordance of TAD boundaries using the Jaccard Index for each timepoint (PN5, early 2-cell, late 2-cell, 8-cell, and ICM, respectively) relative to the mESC cell line. Note that zygotic genome activation in mouse occurs at the 2-cell stage.
Fig 9. Complex CTCF sites produce more stable TADs.
Bar plot showing the average TAD strength (-Log10(p)) for six classes of TAD boundaries: 1) TADs that contain no CTCF sites at either TAD boundary, 2) TADs with one boundary containing no CTCF sites and the other containing one or more CTCF sites, 3) TADs with both boundaries containing one or more CTCF sites all sharing the same orientation, 4) TADs with both boundaries containing one or more CTCF sites in a convergent orientation, 5) TADs with both boundaries containing one or more CTCF sites in a divergent orientation, and 6) TADs where each boundary contains one or more CTCF sites with at least one of the boundaries showing multiple orientations. The average TAD strengths of each class were compared to the non-CTCF class and p-values were generated by bootstrapping.
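One common way to obtain a bootstrap p-value for a difference in mean strength between a CTCF class and the non-CTCF class is sketched below. The caption does not describe the exact resampling scheme, so the one-sided test, the add-one correction, the function name `bootstrap_p_value`, and the toy values are all assumptions.

```python
import random
from statistics import mean
from typing import Sequence


def bootstrap_p_value(group: Sequence[float],
                      reference: Sequence[float],
                      n_boot: int = 2000,
                      seed: int = 0) -> float:
    """One-sided bootstrap p-value for mean(group) > mean(reference).

    Both samples are resampled with replacement; the p-value is the share of
    resamples in which the group mean does not exceed the reference mean,
    with an add-one correction so the result is never exactly zero.
    """
    rng = random.Random(seed)
    not_greater = 0
    for _ in range(n_boot):
        g = rng.choices(group, k=len(group))
        r = rng.choices(reference, k=len(reference))
        if mean(g) - mean(r) <= 0:
            not_greater += 1
    return (not_greater + 1) / (n_boot + 1)


# Toy TAD strengths (hypothetical): a CTCF-containing class vs. the non-CTCF class.
ctcf_class = [8.0, 9.0, 10.0, 11.0, 12.0]
non_ctcf = [1.0, 2.0, 3.0, 2.0, 1.0]
p_separated = bootstrap_p_value(ctcf_class, non_ctcf)
p_same = bootstrap_p_value(non_ctcf, non_ctcf)
print(p_separated, p_same)
```

With fully separated toy groups no resample reverses the ordering, so the p-value reaches its floor of 1/(n_boot + 1); comparing a sample with itself gives a value near one half.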
Fig 10. SARS-CoV-2 infection disrupts pathway-specific DNA loops.
(A) Merged contact matrices of TADs containing olfactory transduction genes at one or both of their boundaries. Contact matrices of TADs were normalized to a randomized set of TADs with similar TAD size to control for the power-law decay observed in Hi-C data. Areas of greater than expected contact densities are shades of red, whereas less than expected contact densities are shades of blue. (B) Box plot of the change (Log10) in strength of boundaries, TAD apexes/loop anchors and TADs for TADs/loops that contain at least one olfactory transduction gene (blue). As a comparison, all genome-wide TADs that contain at least one gene at their boundaries are also plotted (grey). (C) Box plot of the change in gene expression levels (Log2-fold) of genes located at TAD boundaries for TADs that are either shared or lost between the control and COVID-19 samples. The box plot is further classified into shared and lost TADs that contain an olfactory transduction gene at one of their boundaries (blue) and TADs that contain any gene at one of their TAD boundaries (grey). All p-values were generated by bootstrapping.
Fig 11. Sex-specific genes and TAD stability in Drosophila.
(A) Merged contact matrices of TADs containing male-specific genes at one or both of their boundaries. Contact matrices of TADs were normalized to a randomized set of TADs with similar TAD size to control for the power-law decay observed in Hi-C data. Areas of greater than expected contact densities are shades of red, whereas less than expected contact densities are shades of blue. Box plot between male and female contact matrices shows average TAD boundary and loop strengths (-Log10(p)) of TADs containing male-specific genes for the male and female cell lines (blue and red, respectively). (B) Merged contact matrices of TADs containing non-biased genes at one or both of their boundaries. Contact matrices were normalized and depicted as in (A). Box plot between male and female contact matrices shows the average TAD boundary and loop strengths of the TADs containing non-biased genes for the male and female cell lines (blue and red, respectively).

References

    1. Bonev B, Cavalli G. Organization and function of the 3D genome. Nat Rev Genet. 2016;17(11):661–78. doi: 10.1038/nrg.2016.112
    2. Szabo Q, Bantignies F, Cavalli G. Principles of genome folding into topologically associating domains. Sci Adv. 2019;5(4):eaaw1668. doi: 10.1126/sciadv.aaw1668
    3. Oldridge DA, Wood AC, Weichert-Leahey N, Crimmins I, Sussman R, Winter C, et al. Genetic predisposition to neuroblastoma mediated by a LMO1 super-enhancer polymorphism. Nature. 2015;528(7582):418–21. doi: 10.1038/nature15540
    4. Flavahan WA, Drier Y, Liau BB, Gillespie SM, Venteicher AS, Stemmer-Rachamimov AO, et al. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature. 2016;529(7584):110–4. doi: 10.1038/nature16490
    5. Ji X, Dadon DB, Powell BE, Fan ZP, Borges-Rivera D, Shachar S, et al. 3D chromosome regulatory landscape of human pluripotent cells. Cell Stem Cell. 2016;18(2):262–75. doi: 10.1016/j.stem.2015.11.007