Adv Sci (Weinh). 2023 Sep;10(27):e2301058.

doi: 10.1002/advs.202301058. Epub 2023 Jul 28.

DeCOOC Deconvoluted Hi-C Map Characterizes the Chromatin Architecture of Cells in Physiologically Distinctive Tissues

Junmei Wang¹ ², Lu Lu³ ⁴, Shiqi Zheng¹ ², Danyang Wang¹ ² ⁵, Long Jin³ ⁴, Qing Zhang¹, Mingzhou Li³ ⁴, Zhihua Zhang¹ ²

Affiliations

¹ CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China.
² School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China.
³ Livestock and Poultry Multiomics Key Laboratory of Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China.
⁴ Animal Breeding and Genetics Key Laboratory of Sichuan Province, Institute of Animal Genetics and Breeding, Sichuan Agricultural University, Chengdu, 611130, China.
⁵ Sars-Fang Centre & MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, 266100, China.

PMID: 37515382
PMCID: PMC10520690
DOI: 10.1002/advs.202301058

Junmei Wang et al. Adv Sci (Weinh). 2023 Sep.

Abstract

Deciphering variations in chromosome conformation from bulk three-dimensional (3D) genomic data of heterogeneous tissues is key to understanding cell-type-specific genome architecture and dynamics. Surprisingly, computational deconvolution methods for high-throughput chromosome conformation capture (Hi-C) data remain very rare in the literature. Here, deCOOC (deconvolve bulk Hi-C data), a deep convolutional neural network (CNN) that remarkably outperformed all the state-of-the-art tools in the deconvolution task, is developed. Interestingly, neither chromatin accessibility nor Hi-C contact frequency alone is sufficient to explain the power of deCOOC, suggesting the existence of a latent embedded layer of information pertaining to the cell-type-specific 3D genome architecture. Applying deCOOC to in-house-generated bulk Hi-C data from visceral and subcutaneous adipose tissues shows that the characteristic chromatin features of M2 cells in the two anatomical loci are distinctively bound to different physiological functionalities. Taken together, deCOOC is both a reliable Hi-C data deconvolution method and a powerful tool for functional extraction of 3D genome architecture.

Keywords: bulk Hi-C; cell type compositions; computational deconvolution; deep learning.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Overview of the deCOOC model. A) Model architecture. The model consists of four convolution layers (each of the first three is coupled with one max‐pooling layer), two fully connected layers, and a final output layer. The input is two square‐like interaction matrices derived from the Hi‐C matrices of two different chromosomes. The last fully connected layer outputs the predicted cell type proportions. Model training and parameter optimization on Hi‐C data were carried out by minimizing the sum of squared residuals between predicted and ground‐truth cell fractions. B) Input design for the model. From the complete Hi‐C matrix (e.g., with a resolution of 500 kb) of one chromosome, multiple square‐like sub‐interaction matrices with a fixed size (e.g., 30 bins) and step (e.g., 20 bins) are derived along the diagonal. Two diagonal sub‐interaction matrices from two chromosomes are stitched together along the row axis.
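The input design in panel B can be illustrated with a minimal sketch. This is not the authors' code: the function names `diagonal_windows` and `stitch_rows` are hypothetical, the matrices are plain nested lists, and "stitched along the row axis" is assumed here to mean stacking one block on top of the other, so two 30 × 30 blocks give a 60 × 30 input.

```python
def diagonal_windows(matrix, size=30, step=20):
    """Cut size x size blocks along the main diagonal of a square matrix.

    The defaults follow the example values in the caption (30 bins, step 20).
    A trailing remainder shorter than `size` is dropped (an assumption).
    """
    n = len(matrix)
    windows = []
    for start in range(0, n - size + 1, step):
        # Take rows start..start+size and the same range of columns.
        block = [row[start:start + size] for row in matrix[start:start + size]]
        windows.append(block)
    return windows


def stitch_rows(block_a, block_b):
    """Stack two equally sized blocks vertically (assumed meaning of 'row axis')."""
    if len(block_a) != len(block_b) or len(block_a[0]) != len(block_b[0]):
        raise ValueError("blocks must have the same shape")
    return [list(row) for row in block_a] + [list(row) for row in block_b]


def toy_contact_matrix(n_bins):
    """Toy symmetric matrix whose values decay with distance from the diagonal."""
    return [[1.0 / (1 + abs(i - j)) for j in range(n_bins)] for i in range(n_bins)]


if __name__ == "__main__":
    chrom_a = toy_contact_matrix(70)   # stand-in for one chromosome's Hi-C matrix
    chrom_b = toy_contact_matrix(90)   # stand-in for a second chromosome
    blocks_a = diagonal_windows(chrom_a)
    blocks_b = diagonal_windows(chrom_b)
    model_input = stitch_rows(blocks_a[0], blocks_b[0])
    print(len(blocks_a), len(blocks_b), len(model_input), len(model_input[0]))
```

With a step smaller than the window size, consecutive blocks overlap, so contacts near a block boundary appear in two inputs.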

**Figure 2**
deCOOC performs better (lower RMSE and higher CCC [27]) on simulated mouse data than other methods. A) Boxplots of RMSE and CCC values over all test bulk samples from deCOOC and other deconvolution algorithms for the simulated mCC test dataset. B) Lineplots of RMSE and CCC values for each cell type. Each symbol represents the RMSE or CCC value between ground‐truth and predicted cell fractions for one cell type. C) Scatterplots of RMSE (CCC in bottom row) values and the number of Hi‐C contacts for simulated mouse data with deCOOC, ssKL, CDSeq, and CS. Pearson correlation coefficients and p values are given above the plots. Low RMSE and high CCC values represent good prediction performance of the method. For all algorithms, the number of test samples n = 363.
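The two evaluation metrics can be sketched as follows. This assumes that CCC denotes Lin's concordance correlation coefficient, computed with population (1/n) variances; the caption does not spell out the definition, and the function names below are illustrative rather than taken from the deCOOC code.

```python
import math


def rmse(truth, pred):
    """Root-mean-square error between ground-truth and predicted fractions."""
    n = len(truth)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / n)


def ccc(truth, pred):
    """Lin's concordance correlation coefficient (assumed meaning of CCC).

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    using population (1/n) moments. The value is undefined (division by
    zero) when both inputs are constant and have the same mean.
    """
    n = len(truth)
    mean_t = sum(truth) / n
    mean_p = sum(pred) / n
    var_t = sum((t - mean_t) ** 2 for t in truth) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, pred)) / n
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


if __name__ == "__main__":
    # Made-up cell fractions for one cell type across four bulk samples.
    true_fractions = [0.10, 0.30, 0.45, 0.15]
    predicted = [0.12, 0.27, 0.50, 0.11]
    print(rmse(true_fractions, predicted), ccc(true_fractions, predicted))
```

Unlike the Pearson correlation, CCC penalizes a constant offset or scale difference between predictions and ground truth, which is why it complements RMSE for fraction estimates.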

**Figure 3**
deCOOC behaves more robustly on simulated HFC data than the other methods. A) Boxplots of RMSE and CCC values over all test bulk samples from deCOOC and other deconvolution algorithms for the simulated HFC test dataset. B) Lineplots of RMSE and CCC values for each cell type. Each symbol represents the RMSE or CCC value between ground‐truth and predicted cell fractions for one cell type. C) Scatterplots of RMSE (CCC in bottom row) values and the number of Hi‐C interaction contacts of simulated HFC bulk data with deCOOC, CS, and DeconRNASeq. Pearson correlation coefficients and p values are given above the plots. For HFC data, the number of test samples n = 486.

**Figure 4**
SHAP analysis for model interpretation. A) Scatterplots show a weak correlation between SHAP values and Hi‐C (observed/expected) values for the mCC and HFC examples (the same examples as shown in panel C). Pearson correlation coefficients and p values are given above the plots. B) Correlation analysis between chromatin accessibility and SHAP values based on the HFC dataset. For the Astro and ODC cell types, chromatin accessibility was significantly higher in the group with higher SHAP values (i.e., (0.01, 0.1)) than in the other two groups. For the L23 cell type, chromatin openness was dramatically greater than in the lower SHAP value group only in the median SHAP value group (0.001, 0.01). The correlation between SHAP values and chromatin accessibility therefore depends on the cell type. P values were calculated using a one‐sided Wilcoxon signed‐rank test. C) Examples of paired Hi‐C matrices (lower left) and SHAP value maps (upper right) for each cell type of mCC and HFC. The regions of 13–28 Mb and 10–25 Mb of the two chromosomes for the mCC and HFC bulk examples are shown. Cell types and chromosome numbers are labeled at the top and left of the plots, respectively, while the fraction of each cell type is presented in parentheses. For the HFC example, the fourteen cell types were sorted into four categories (labeled by four rectangles of different colors) according to the clustering of cell‐type specific chromatin interactions [6]. D) Heatmap of SHAP values (example shown in C) for each cell type prediction (left for mCC example, right for HFC example). Each row in the heatmap indicates the SHAP values for an interaction site. (Significant differences: *P < 0.05, **P < 0.01, ***P < 0.001).

**Figure 5**
deCOOC performs better than CS and CDSeq (lower RMSE and higher CCC) on pig tissue Hi‐C data. A) Boxplots of RMSE and CCC values from deCOOC, CS, and CDSeq for simulated pig bulk Hi‐C data (randomly sampled experimental Hi‐C contacts of four pig cell lines with artificially produced cell fractions). B) Boxplots of RMSE and CCC for assessing deconvolution performance of the three algorithms on real pig tissues. The deconvolution of CS and CDSeq was conducted on all 22 real adipose samples. The deconvolution of deCOOC was performed on five samples randomly selected from the 22 adipose samples (the remaining 17 samples were used to fine‐tune the model), and this procedure was repeated five times. C) Scatterplots of CIBERSORTx‐predicted cell fractions (x‐axis) and deconvoluted cell fractions (y‐axis) from fine‐tuned deCOOC, CS, and CDSeq on real samples. The corresponding CCC values for the three methods are presented above the plots. D) RMSE and CCC values of deconvolution on five real test tissues (x‐axis) from the three deconvolution methods. deCOOC was fine‐tuned using different numbers of real tissues. E) Differential expression on a log2 scale of five genes for subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT). (Significant differences: ****P < 0.0001).

References

1. Gates M., Agency U. S., Miiro G., Serwanga J., Pozniak A., Mcphee D., Jaoko W., Dehovitz J., Bekker L. G., Pitisuttithum P., J. Virol. 2009, 83, 7337.
2. Goel V. Y., Hansen A. S., Wiley Interdiscip. Rev.: Dev. Biol. 2021, 10, e395.
3. Ron G., Globerson Y., Moran D., Kaplan T., Nat. Commun. 2017, 8, 2237.
4. Dekker J., Mirny L. A., Cell 2016, 164, 1110.
5. Zheng H., Biol. Mood Anxiety Disord. 2019, 20, 535.
