Contribution of 3D genome topological domains to genetic risk of cancers: a genome-wide computational study

doi:10.1186/s40246-022-00375-2

Hum Genomics. 2022 Jan 11;16(1):2.

Kim Philipp Jablonski^{1,2}, Leopold Carron^{3,4}, Julien Mozziconacci^{3,5}, Thierry Forné⁶, Marc-Thorsten Hütt⁷, Annick Lesne^{8,9}

Affiliations

¹ Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland.
² SIB Swiss Institute of Bioinformatics, 4058, Basel, Switzerland.
³ Laboratoire de Physique Théorique de la Matière Condensée, LPTMC, CNRS, Sorbonne Université, Paris, France.
⁴ Laboratory of Computational and Quantitative Biology, LCQB, Sorbonne Université, Paris, France.
⁵ Structure et Instabilité des Génomes, Muséum National d'Histoire Naturelle, Paris, France.
⁶ Institut de Génétique Moléculaire de Montpellier, IGMM, CNRS, Univ. Montpellier, Montpellier, France.
⁷ Department of Life Sciences and Chemistry, Jacobs University Bremen, Bremen, Germany. m.huett@jacobs-university.de.
⁸ Laboratoire de Physique Théorique de la Matière Condensée, LPTMC, CNRS, Sorbonne Université, Paris, France. annick.lesne@igmm.cnrs.fr.
⁹ Institut de Génétique Moléculaire de Montpellier, IGMM, CNRS, Univ. Montpellier, Montpellier, France. annick.lesne@igmm.cnrs.fr.

PMID: 35016721
PMCID: PMC8753905
DOI: 10.1186/s40246-022-00375-2

Kim Philipp Jablonski et al. Hum Genomics. 2022.

Abstract

Background: Genome-wide association studies have identified statistical associations between various diseases, including cancers, and a large number of single-nucleotide polymorphisms (SNPs). However, they provide no direct explanation of the mechanisms underlying the association. Based on the recent discovery that changes in three-dimensional genome organization may have functional consequences on gene regulation favoring diseases, we investigated systematically the genome-wide distribution of disease-associated SNPs with respect to a specific feature of 3D genome organization: topologically associating domains (TADs) and their borders.

Results: For each of 449 diseases, we tested whether the associated SNPs are present in TAD borders more often than observed by chance, where chance (i.e., the null model in statistical terms) corresponds to the same number of pointwise loci drawn at random either in the entire genome, or in the entire set of disease-associated SNPs listed in the GWAS catalog. Our analysis shows that a fraction of diseases displays such a preferential localization of their risk loci. Moreover, cancers are relatively more frequent among these diseases, and this predominance is generally enhanced when considering only intergenic SNPs. The structure of SNP-based diseasome networks confirms that localization of risk loci in TAD borders differs between cancers and non-cancer diseases. Furthermore, different TAD border enrichments are observed in embryonic stem cells and differentiated cells, consistent with changes in topological domains along embryogenesis and delineating their contribution to disease risk.
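The second null model described above (the same number of loci drawn at random from the full set of disease-associated SNPs) corresponds to a hypergeometric tail probability. The following is a minimal sketch of that calculation, not the authors' code: the function name, argument names, and toy numbers are hypothetical, and the multiple-testing correction mentioned in the figure captions is not shown.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, in_border, draws, observed):
    """One-sided enrichment p-value P(X >= observed).

    population: total number of SNPs in the background set
    in_border:  number of background SNPs lying in TAD borders
    draws:      number of SNPs associated with the disease under test
    observed:   number of those disease SNPs lying in TAD borders
    (All four names are illustrative assumptions.)
    """
    if not (0 <= in_border <= population and 0 <= draws <= population):
        raise ValueError("counts must satisfy 0 <= in_border, draws <= population")
    lowest = max(observed, 0)
    highest = min(draws, in_border)
    total = comb(population, draws)
    # Sum the hypergeometric probability mass over the upper tail.
    tail = sum(
        comb(in_border, i) * comb(population - in_border, draws - i)
        for i in range(lowest, highest + 1)
    )
    return tail / total


# Toy numbers only (not taken from the study): 10 background SNPs,
# 3 of them in borders, a disease with 4 SNPs, 2 of which are in borders.
p_value = hypergeom_enrichment_pvalue(10, 3, 4, 2)
```

A small p-value here would indicate that the disease's SNPs fall in TAD borders more often than a random draw of the same size from the background set.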

Conclusions: Our results suggest that, for certain diseases, part of the genetic risk lies in a local genetic variation affecting the genome partitioning in topologically insulated domains. Investigating this possible contribution to genetic risk is particularly relevant in cancers. This study thus opens a way of interpreting genome-wide association studies, by distinguishing two types of disease-associated SNPs: one with an effect on an individual gene, the other acting in interplay with 3D genome organization.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

**Fig. 1**
Data structure behind the investigation. Underlying Hi-C data are displayed as a heat map (the redder the more contacts, as indicated in the color bar), here for a region of chromosome 11 (chr11: 123,050,000–123,550,000, hg19 coordinates), drawn from data published in [30], for IMR90 cell type, at 10 kb-resolution. TADs are underlined with black triangle lines. They are determined using TopDom algorithm [31], which identifies a demarcation between two TADs as a local minimum of the number of contacts in a sliding window of half-size k bins, where k is a tunable parameter (blue diamonds, the full-lined one corresponding to the limit of a TAD). TAD borders are defined as 20 kb-regions from TAD ends inward, and underlined here as small triangles filled in gray. Vertical lines pinpoint disease-associated SNPs located in TAD borders (full line for cancer-associated SNPs, dashed lines for SNPs associated with non-cancer diseases). For each cancer or each non-cancer disease, we investigate the potential over-representation of its associated SNPs in TAD borders. The dashed white arrow indicates increased physical contacts and increased regulatory interactions between adjacent TADs that would possibly appear in a border affected by the presence of an at-risk SNP allele
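The border definition in this caption (20 kb regions from TAD ends inward) can be illustrated with a short sketch. It assumes TADs are given as half-open `(start, end)` coordinate pairs on a single chromosome and that a TAD shorter than two border widths counts entirely as border; both conventions, and all names, are assumptions made for illustration rather than details stated in the source.

```python
BORDER_SIZE = 20_000  # 20 kb, as stated in the caption


def tad_borders(tads, border_size=BORDER_SIZE):
    """Return border intervals taken inward from each TAD end.

    tads: iterable of (start, end) pairs, half-open, same chromosome.
    """
    borders = []
    for start, end in tads:
        if end - start < 2 * border_size:
            # Assumption: a very short TAD is treated as all border.
            borders.append((start, end))
        else:
            borders.append((start, start + border_size))
            borders.append((end - border_size, end))
    return borders


def in_border(position, borders):
    """True if a SNP position falls inside any border interval."""
    return any(low <= position < high for low, high in borders)


# Hypothetical coordinates, chosen only to exercise the functions.
example_tads = [(100_000, 300_000), (300_000, 330_000)]
example_borders = tad_borders(example_tads)
```

Counting how many of a disease's SNPs satisfy `in_border` gives the observed count that an enrichment test would then compare against a null model.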

**Fig. 2**
Preferential location in TAD borders of the SNPs associated with a disease. A Normalized histogram of enrichment p-values (corrected for multiple testing and plotted as [− log₁₀ p] on the horizontal axis) for cancers (yellow bars) and non-cancer diseases (blue bars, overlap in gray), considering the contribution of all SNPs to a potential TAD-border enrichment. The results have been aggregated over datasets for different cell types from [30] at 10 kb-resolution and all considered values of the window parameter k of TopDom algorithm. The two histograms have been normalized separately. The dashed red line indicates the significance threshold at p* = 0.05. B Same as (A) considering only intergenic disease-associated SNPs (daSNPs) in the hypergeometric enrichment test. C Difference between the cancer histogram, in orange, and the histogram for non-cancer diseases in (A), showing that relatively more cancers display a preferential location of their associated SNPs in TAD borders. D Same as (C) for the histograms in (B), showing that the relative dominance of cancers, among diseases displaying TAD-border enrichment, is enhanced when considering only intergenic daSNPs

**Fig. 3**
Preferential location of daSNPs in TAD borders occurs relatively more often for cancers. A comparison of the fraction of cancers (orange) and fraction of non-cancer diseases (blue) displaying TAD-border enrichment is presented for various filters on the GWAS catalog: considering all daSNPs, then only exonic, intronic, or intergenic daSNPs. Only diseases displaying enrichment for a majority of values of the TopDom window-size parameter k are included (see Methods). A Aggregation over five cell types (six datasets) using data at 10 kb-resolution from [30]. B–G Detailed comparison for each of the six datasets, where the cell type is indicated above each panel, together with the restriction enzyme (DpnII or MboI) used in the Hi-C experiment. Stars indicate when the difference between cancers and non-cancer diseases is statistically significant (Fisher exact test, *p ≤ 0.05, **p ≤ 0.01, ****p ≤ 0.0001). Analyses for data from [, –38] are presented in Additional file 1: Fig. S7
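The Fisher exact test named in this caption compares two proportions arranged in a 2×2 table (enriched versus not enriched, cancers versus non-cancer diseases). Below is a minimal standard-library sketch of a two-sided version; the caption does not say whether the test was one- or two-sided, so the two-sided choice, the function name, and the toy counts are all assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Example layout (illustrative): a = cancers with enrichment,
    b = cancers without, c = non-cancer diseases with enrichment,
    d = non-cancer diseases without.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of a table with x in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum all tables at most as probable as the observed one
    # (small tolerance guards against floating-point ties).
    total = sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Toy counts only, not data from the study.
p_toy = fisher_exact_two_sided(3, 1, 1, 3)
```

Where SciPy is available, `scipy.stats.fisher_exact` provides the same test and would normally be preferred over a hand-written version.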

**Fig. 4**
Comparison of enrichment results for IMR90 cells and embryonic cells. Same as Fig. 3, now comparing three datasets for IMR90 cell line (left column, panels C, D and E) with two datasets for embryonic stem cells (hESC, panels F and G). The two panels at the top display the results aggregated over the individual datasets, for IMR90 cells (pink background, panel A) and embryonic stem cells (blue background, panel B), respectively

**Fig. 5**
Clustering of cancers and non-cancer diseases sharing daSNPs. In this network representation, red (resp. violet) nodes correspond to cancers (resp. non-cancer diseases) displaying TAD-border enrichment in daSNPs, orange (resp. light blue) nodes to other cancers (resp. other non-cancer diseases). An edge is drawn between two diseases when they share an associated SNP A located in a TAD border for a majority of values of the window parameter k (border SNPs, see Methods) or B belonging to the complementary set (non-border SNPs). C, D Same as (A, B) when considering only intergenic SNPs. The four networks have been visualized using *NetworkX* Python package. Underlying Hi-C data from [30], IMR90 cell type, 10 kb-resolution
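The edge rule in this caption (two diseases are linked when they share an associated SNP) can be sketched with the standard library alone. This is an illustrative reconstruction under stated assumptions, not the authors' pipeline: the disease labels and SNP identifiers are placeholders, and the input is assumed to be already filtered to border or non-border SNPs.

```python
from collections import defaultdict
from itertools import combinations


def shared_snp_edges(disease_to_snps):
    """Return the set of disease pairs sharing at least one SNP.

    disease_to_snps: dict mapping a disease label to a set of SNP ids.
    """
    # Invert the mapping: which diseases is each SNP associated with?
    snp_to_diseases = defaultdict(set)
    for disease, snps in disease_to_snps.items():
        for snp in snps:
            snp_to_diseases[snp].add(disease)

    edges = set()
    for diseases in snp_to_diseases.values():
        # Sorting gives each undirected edge one canonical orientation.
        for pair in combinations(sorted(diseases), 2):
            edges.add(pair)
    return edges


# Placeholder labels and ids, invented for the example.
toy = {
    "cancer_A": {"snp1", "snp2"},
    "cancer_B": {"snp2"},
    "disease_C": {"snp3"},
    "disease_D": {"snp3", "snp1"},
}
toy_edges = shared_snp_edges(toy)
```

The resulting pairs can be passed to `networkx.Graph.add_edges_from` to build a graph for layout and drawing, which matches the package named in the caption.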

Cited by

Epigenetics Role in Spermatozoa Function: Implications in Health and Evolution-An Overview.
Andreu-Noguera J, López-Botella A, Sáez-Espinosa P, Gómez-Torres MJ. Life (Basel). 2023 Jan 29;13(2):364. doi: 10.3390/life13020364. PMID: 36836724. Free PMC article. Review.
CTCF: A misguided jack-of-all-trades in cancer cells.
Segueni J, Noordermeer D. Comput Struct Biotechnol J. 2022 May 27;20:2685-2698. doi: 10.1016/j.csbj.2022.05.044. eCollection 2022. PMID: 35685367. Free PMC article. Review.
Network location and clustering of genetic mutations determine chronicity in a stylized model of genetic diseases.
Nyczka P, Falk J, Hütt MT. Sci Rep. 2022 Nov 19;12(1):19906. doi: 10.1038/s41598-022-23775-9. PMID: 36402799. Free PMC article.
Hidden secrets of the cancer genome: unlocking the impact of non-coding mutations in gene regulatory elements.
Iñiguez-Muñoz S, Llinàs-Arias P, Ensenyat-Mendez M, Bedoya-López AF, Orozco JIJ, Cortés J, Roy A, Forsberg-Nilsson K, DiNome ML, Marzese DM. Cell Mol Life Sci. 2024 Jun 20;81(1):274. doi: 10.1007/s00018-024-05314-z. PMID: 38902506. Free PMC article. Review.
3D genome alterations in T cells associated with disease activity of systemic lupus erythematosus.
Zhao M, Feng D, Hu L, Liu L, Wu J, Hu Z, Long H, Kuang Q, Ouyang L, Lu Q. Ann Rheum Dis. 2023 Feb;82(2):226-234. doi: 10.1136/ard-2022-222653. Epub 2022 Sep 1. PMID: 36690410. Free PMC article.

References

1. Hardy J, Singleton A. Genome wide association studies and human disease. New Engl J Med. 2009;360:1759–1768.
2. Visscher PM, Wray NR, Zhang Q, Sklar P, McCarthy MI, Brown MA, et al. 10 years of GWAS discovery: biology, function, and translation. Am J Hum Genet. 2017;101:5–22.
3. Erichsen HC, Chanock SJ. SNPs in cancer research and treatment. Brit J Cancer. 2004;90:747–751.
4. McKay JD, Hung R, Han Y, Zong X, Carreras-Torres R, Christiani D, et al. Large-scale association analysis identifies new lung cancer susceptibility loci and heterogeneity in genetic susceptibility across histological subtypes. Nat Genet. 2017;49:1126–1132.
5. Maurano MT, Humbert R, Thurman RE, Haugen E, Wang H, Reynolds AP, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012;337:1190–1195.
