Bioinformatics. 2020 Jun 1;36(12):3645-3651.
doi: 10.1093/bioinformatics/btaa249.

Serpentine: a flexible 2D binning method for differential Hi-C analysis

Lyam Baudry et al. Bioinformatics. 2020.

Abstract

Motivation: Hi-C contact maps reflect the relative contact frequencies between pairs of genomic loci, quantified through deep sequencing. Differential analyses of these maps enable downstream biological interpretations. However, the multi-fractal nature of the chromatin polymer inside the cellular envelope results in contact frequency values spanning several orders of magnitude: contacts between loci pairs separated by large genomic distances are much sparser than closer pairs. The same is true for poorly covered regions, such as repeated sequences. Both distant and poorly covered regions translate into low signal-to-noise ratios. There is no clear consensus to address this limitation.

Results: We present Serpentine, a fast, flexible procedure operating on raw data, which considers the contacts in each region of a contact map. Binning is performed only when necessary on noisy regions, preserving informative ones. This results in high-quality, low-noise contact maps that can be conveniently visualized for rigorous comparative analyses.

Availability and implementation: Serpentine is available on the PyPI repository and https://github.com/koszullab/serpentine; documentation and tutorials are provided at https://serpentine.readthedocs.io/en/latest/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Contact matrix and algorithm. (A) Hi-C matrix (S. cerevisiae chromosome V). Since contact matrices are symmetric, only one half is shown. Each pixel of the map corresponds to a pair of coordinates of genomic segments (or bins). The color intensity reflects the frequency of contacts. The horizontal blue line corresponds to one of the diagonals of the contact matrix, with i indicating the distance separating the pairs of segments positioned along that diagonal. At high i values, i.e. for DNA segments separated by large distances, the matrix becomes sparse (white pixels). The lower panel shows the mean and coefficient of variation (CV) for all diagonals (i.e. all i values) of the matrix. The resulting scatter plot reveals a transition that defines two constants: the CV constant CV1 (green line, for μ>t) and the mean constant t (dotted line, at the intersection of the green line and the Poisson distribution represented by the red line). Dots to the left of t and above CV1 are subject to the effect of sparsity. (B) Algorithm flowchart of Serpentine. See main text for a detailed description of the workflow. (Color version of this figure is available at Bioinformatics online.)
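The per-diagonal statistics plotted in the lower panel of Fig. 1A can be illustrated with a minimal sketch. This is not the Serpentine implementation: the function names `diagonal_mean_cv` and `poisson_cv` are hypothetical, and the toy matrix stands in for real Hi-C counts. The only statistical fact relied on is that a Poisson variable with mean μ has CV = 1/√μ, which is the kind of reference curve the caption describes.

```python
import numpy as np


def diagonal_mean_cv(matrix):
    """Return (means, cvs), one value per diagonal offset i of a square matrix.

    Hypothetical helper for illustration; not part of the serpentine package.
    The CV is left as NaN for diagonals whose mean is zero.
    """
    matrix = np.asarray(matrix, dtype=float)
    n = matrix.shape[0]
    means = np.empty(n)
    cvs = np.full(n, np.nan)
    for i in range(n):
        # Offset i selects the pixels whose two bins are i bins apart.
        diag = np.diagonal(matrix, offset=i)
        mu = diag.mean()
        means[i] = mu
        if mu > 0:
            # Population standard deviation divided by the mean.
            cvs[i] = diag.std() / mu
    return means, cvs


def poisson_cv(mu):
    """CV expected for pure Poisson counting noise with mean mu (mu > 0)."""
    return 1.0 / np.sqrt(mu)


# Toy symmetric matrix (assumed data, for illustration only).
toy = np.array([[2, 1, 0],
                [1, 4, 3],
                [0, 3, 6]])
means, cvs = diagonal_mean_cv(toy)
for i, (mu, cv) in enumerate(zip(means, cvs)):
    print(f"i={i}  mean={mu:.3f}  CV={cv:.3f}")
```

Plotting `cvs` against `means` on log axes, together with `poisson_cv`, would give a scatter plot of the same kind as the one described in the caption; how the constants CV1 and t are then read off is described in the main text of the paper, not here.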
Fig. 2.
Serpentine analysis using biological replicates of asynchronous yeast cultures. Contact maps of chromosome V, ratio heatmaps and MD plots before (A; bin = 2.5 kb) and after (B) Serpentine binning. On the left panels, half of each symmetric matrix is depicted: top right for replicate 1 and bottom left for replicate 2. Center panels show the full symmetric ratio heatmap of the two matrices from the corresponding left panel. See the Materials and Methods section for MD plot details.
Fig. 3.
Serpentine analysis comparing contact maps generated during meiosis. Time course at 0 and 4 h, before (A) and after (B) Serpentine binning. Left, right and central panels are the same as in Figure 2. (C) Serpentine analysis comparing the Rec8 cohesin deletion mutant to control, in a population blocked in pachytene through Ndt80 deletion: before binning, after binning and ratio maps. The green rectangles along the axis of the contact maps correspond to the deposition sites of the cohesin meiotic subunit Rec8. The green lines connect these deposition sites to the diagonal. Green arrow: telomere-telomere contacts. Red arrow: centromere position. Cyan arrow: meiotic loop. (Color version of this figure is available at Bioinformatics online.)
Fig. 4.
HiCcompare analysis of ratio heatmaps before (bottom left) or after (top right) Serpentine binning. Maps and masks obtained using the data from Figure 2 (A) (biological replicates, asynchronous cells) or from Figure 3 (B) (meiosis t=0 and 4 h cells) and (C) (pachytene-arrested cells, rec8Δ and control). A semi-transparent mask derived from the HiCcompare analysis overlays the ratio heatmaps and highlights the ratios found significant by HiCcompare (no dark areas).
Fig. 5.
Serpentine analysis on low-coverage data. Same as Figure 3 middle panels, but using the same data downsampled to reduce the global coverage. The percentage of the total initial count kept after downsampling is indicated below each panel.
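The caption for Fig. 5 does not say how the downsampling was performed. One common way to simulate lower coverage, shown below purely as an assumption, is binomial thinning: each contact is kept independently with probability equal to the target fraction. The function name `downsample_contacts` and the toy matrix are hypothetical, and the sketch expects a symmetric matrix of non-negative integer counts.

```python
import numpy as np


def downsample_contacts(matrix, fraction, seed=0):
    """Thin a symmetric integer contact matrix by keeping each contact
    with probability `fraction` (binomial thinning).

    Hypothetical helper for illustration; the paper's own downsampling
    procedure may differ.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(matrix, dtype=np.int64)
    # Work on the upper triangle (diagonal included) so that each pair
    # of bins is sampled exactly once.
    upper = np.triu(counts)
    thinned = rng.binomial(upper, fraction)
    # Mirror the strict upper triangle to restore symmetry.
    return thinned + np.triu(thinned, k=1).T


# Toy symmetric matrix (assumed data, for illustration only).
toy = np.array([[50, 20, 5],
                [20, 60, 25],
                [5, 25, 40]])
low = downsample_contacts(toy, fraction=0.1)
kept = np.triu(low).sum() / np.triu(toy).sum()
print(f"fraction of contacts kept: {kept:.2%}")
```

The realized fraction fluctuates around the requested one because each contact is sampled independently; fixing the seed makes a given run reproducible.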
Fig. 6.
Serpentine applied to mammalian data from Patel et al. (2019). Comparison of mouse spermatocytes during the zygotene stage of meiosis versus control (mouse interphase cells). Data from mouse chromosome II, region 27 700–29 160 (1.46 Mb, 1 kb bins) are depicted. See the previous figures for details.
Fig. 7.
Performance. (A) Influence of ICE iterative normalization (Imakaev et al., 2012), with or without filtering, on Serpentine outcomes. (B) Comparison between log-ratios for binless (left) and Serpentine (right), using the binless-provided datasets (IMR90 versus GM12878) for the FOXP1 locus (chromosome 3, region 7–7.2 Mb).

References

    1. Cournac A. et al. (2012) Normalization of a chromosomal contact map. BMC Genomics, 13, 436.
    2. Dekker J. et al. (2002) Capturing chromosome conformation. Science, 295, 1306–1311.
    3. Forcato M. et al. (2017) Comparison of computational methods for Hi-C data analysis. Nat. Methods, 14, 679–685.
    4. Imakaev M. et al. (2012) Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods, 9, 999–1003.
    5. Imakaev M.V. et al. (2015) Modeling chromosomes: beyond pretty pictures. FEBS Lett., 589, 3031–3036.
