Front Genet. 2022 Jan 28:13:786501.
doi: 10.3389/fgene.2022.786501. eCollection 2022.

Comparison of Capture Hi-C Analytical Pipelines

Dina Aljogol et al. Front Genet. 2022.

Abstract

It is now evident that DNA forms an organized nuclear architecture, which is essential to maintain the structural and functional integrity of the genome. Chromatin organization can be systematically studied due to the recent boom in chromosome conformation capture technologies (e.g., 3C and its successors 4C, 5C and Hi-C), which is accompanied by the development of computational pipelines to identify biologically meaningful chromatin contacts in such data. However, not all tools are applicable to all experimental designs and all structural features. Capture Hi-C (CHi-C) is a method that uses an intermediate hybridization step to target and select predefined regions of interest in a Hi-C library, thereby increasing effective sequencing depth for those regions. It allows researchers to investigate fine chromatin structures at high resolution, for instance promoter-enhancer loops, but it introduces additional biases with the capture step, and therefore requires specialized pipelines. Here, we compare multiple analytical pipelines for CHi-C data analysis. We consider the effect of retaining multi-mapping reads and compare the efficiency of different statistical approaches in both identifying reproducible interactions and determining biologically significant interactions. At restriction fragment level resolution, the number of multi-mapping reads that could be rescued was negligible. The number of identified interactions varied widely, depending on the analytical method, indicating large differences in type I and type II error rates. The optimal pipeline depends on the project-specific tolerance level of false positive and false negative chromatin contacts.

Keywords: capture Hi-C; chromatin organization; computational pipeline; epigenetics; gene regulation.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Research summary. (A) CHi-C analytical tools used and their sources. (B) Strategy overview. HiCUP and mHiC were compared for their performance in mapping read pairs and filtering experimental artefacts. GOTHiC, CHiCMaxima, CHiCAGO and CHiCANE were compared for their ability to identify reproducible, biologically relevant interactions. Yellow boxes indicate the type of background model used by each tool. GOTHiC local filtering (LF) is an optional downstream filtering of GOTHiC globally significant interactions based on the local interaction profile of each bait.
FIGURE 2
Valid read pairs. (A) Number of identified valid pairs using HiCUP and mHiC. Black bars indicate the total number of raw read pairs. Read pairs were filtered to keep only those with a mapping quality (MAPQ) ≥10. BF: number of mapped read pairs before MAPQ filtering. AF: number of mapped reads after MAPQ filtering. (B) Number of uniquely mapping read pairs using mHiC at different resolutions. (C) Number of rescued multi-mapping read pairs using mHiC at different resolutions. RF: restriction fragment.
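The MAPQ filter mentioned in this caption can be illustrated with a minimal sketch. It assumes that a read pair is kept only when both mates reach the threshold; the caption does not say how the threshold is applied to a pair. The tuple layout, the function name and the example values are hypothetical.

```python
MIN_MAPQ = 10  # threshold stated in the Figure 2 caption


def filter_pairs_by_mapq(pairs, min_mapq=MIN_MAPQ):
    """Keep read pairs whose two mates both have MAPQ >= min_mapq.

    pairs: iterable of (pair_id, mapq_mate1, mapq_mate2) tuples
    (a hypothetical layout chosen for this example).
    Requiring both mates to pass is an assumption, not taken from the paper.
    """
    return [p for p in pairs if p[1] >= min_mapq and p[2] >= min_mapq]


# Made-up example values, for illustration only.
example_pairs = [
    ("pair1", 42, 37),
    ("pair2", 9, 60),
    ("pair3", 10, 10),
    ("pair4", 0, 0),
]

kept = filter_pairs_by_mapq(example_pairs)
before_filtering = len(example_pairs)  # corresponds to "BF" in the caption
after_filtering = len(kept)            # corresponds to "AF" in the caption
```

In this toy input, pair2 is dropped because one mate falls below the threshold, and pair3 is kept because the comparison is inclusive (≥10).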
FIGURE 3
Reproducibility of HiCUP-preprocessed significant interactions. (A) Number of HiCUP-preprocessed significant interactions using GOTHiC (with and without LF), CHiCAGO, CHiCANE and CHiCMaxima. (B) Bar plots represent the percentage of non-baited fragments that overlap in at least two replicates for each bait. Overlaps were studied for exact fragment-level interactions and interactions where non-baited fragment-ends were extended by 2.5 kb or 20 kb.
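One plausible reading of the reproducibility measure in panel (B) is sketched below. For each bait, it computes the percentage of distinct non-baited fragments that are matched in at least two replicates, optionally after extending fragment ends by a fixed window. The exact definition used by the authors is not given here, so the choices to extend both fragments being compared and to use the union of fragments across replicates as the denominator are assumptions. The function names and coordinates are hypothetical.

```python
def extend(fragment, ext):
    """Widen a (chrom, start, end) fragment by ext bases on each side."""
    chrom, start, end = fragment
    return (chrom, max(0, start - ext), end + ext)


def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def reproducible_pct(replicates, ext=0):
    """Percentage of non-baited fragments found in >= 2 replicates, per bait.

    replicates: list of dicts mapping bait name -> set of (chrom, start, end)
    non-baited fragments (a hypothetical layout for this example).
    """
    baits = set().union(*(rep.keys() for rep in replicates))
    result = {}
    for bait in baits:
        per_rep = [rep.get(bait, set()) for rep in replicates]
        all_frags = set().union(*per_rep)
        if not all_frags:
            continue
        n_repro = 0
        for frag in all_frags:
            wide = extend(frag, ext)
            # Count replicates holding at least one fragment that overlaps.
            hits = sum(
                any(overlaps(wide, extend(other, ext)) for other in frags)
                for frags in per_rep
            )
            if hits >= 2:
                n_repro += 1
        result[bait] = 100.0 * n_repro / len(all_frags)
    return result


# Made-up coordinates, for illustration only.
rep1 = {"BCL2": {("chr18", 1000, 2000), ("chr18", 50000, 51000)}}
rep2 = {"BCL2": {("chr18", 1000, 2000), ("chr18", 53000, 54000)}}
rep3 = {"BCL2": {("chr18", 90000, 91000)}}

exact = reproducible_pct([rep1, rep2, rep3], ext=0)
near = reproducible_pct([rep1, rep2, rep3], ext=2500)
far = reproducible_pct([rep1, rep2, rep3], ext=20000)
```

With this toy input, only the shared fragment counts as reproducible at exact fragment level, and widening the window lets nearby fragments from different replicates match each other.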
FIGURE 4
Biologically relevant interactions. Bubble heat maps represent the proportion of interactions that overlap (A) DHS or (B) H3K4me1 peaks. The size of the circle represents the log10 number of promoter-other interactions identified by the different methods. (C) A comparison between BCL2 promoter interactions in GM12878 cells within ± 1 Mb as reported by each method. Dashed lines represent examples of interactions that are identified in all tools (magenta), all tools except CHiCANE (blue), only in GOTHiC (yellow). ENCODE ChIP-seq profiles for DHS, H3K4me1 and H3K27ac in GM12878 cell lines are shown in the bottom tracks. DHS: DNase I Hypersensitivity sites.
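The two quantities encoded in each bubble of panels (A) and (B) can be sketched as follows: the proportion of interactions whose non-baited end overlaps a peak (colour), and the log10 of the interaction count (size). This is an illustration under assumptions, not the authors' code. Treating any one-base overlap as a hit, along with the function names and coordinates, is hypothetical.

```python
import math


def overlaps_any(fragment, peaks):
    """True if a (chrom, start, end) fragment overlaps at least one peak."""
    chrom, start, end = fragment
    return any(c == chrom and s < end and start < e for c, s, e in peaks)


def bubble_values(other_ends, peaks):
    """Return the overlap proportion and log10 count for one method.

    other_ends: list of (chrom, start, end) non-baited interaction ends.
    peaks: list of (chrom, start, end) peak intervals, e.g. DHS or H3K4me1.
    """
    n = len(other_ends)
    if n == 0:
        return None  # no interactions, so nothing to plot
    hits = sum(overlaps_any(frag, peaks) for frag in other_ends)
    return {"proportion": hits / n, "log10_count": math.log10(n)}


# Made-up coordinates, for illustration only.
other_ends = [
    ("chr18", 1000, 2000),
    ("chr18", 5000, 6000),
    ("chr18", 9000, 9500),
    ("chr1", 1000, 2000),
]
peaks = [("chr18", 1500, 1700), ("chr18", 5900, 6100)]

values = bubble_values(other_ends, peaks)
```

Reporting the proportion together with the count is useful because a method that calls very few interactions can reach a high overlap proportion while missing many true contacts.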
