Hum Genomics. 2022 Jan 11;16(1):2.
doi: 10.1186/s40246-022-00375-2.

Contribution of 3D genome topological domains to genetic risk of cancers: a genome-wide computational study

Kim Philipp Jablonski et al. Hum Genomics. 2022.

Abstract

Background: Genome-wide association studies have identified statistical associations between various diseases, including cancers, and a large number of single-nucleotide polymorphisms (SNPs). However, they provide no direct explanation of the mechanisms underlying the association. Based on the recent discovery that changes in three-dimensional genome organization may have functional consequences on gene regulation favoring diseases, we investigated systematically the genome-wide distribution of disease-associated SNPs with respect to a specific feature of 3D genome organization: topologically associating domains (TADs) and their borders.

Results: For each of 449 diseases, we tested whether the associated SNPs are present in TAD borders more often than observed by chance, where chance (i.e., the null model in statistical terms) corresponds to the same number of pointwise loci drawn at random either in the entire genome, or in the entire set of disease-associated SNPs listed in the GWAS catalog. Our analysis shows that a fraction of diseases displays such a preferential localization of their risk loci. Moreover, cancers are relatively more frequent among these diseases, and this predominance is generally enhanced when considering only intergenic SNPs. The structure of SNP-based diseasome networks confirms that localization of risk loci in TAD borders differs between cancers and non-cancer diseases. Furthermore, different TAD border enrichments are observed in embryonic stem cells and differentiated cells, consistent with changes in topological domains along embryogenesis and delineating their contribution to disease risk.
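A minimal sketch of the enrichment test described above, assuming the second null model (the SNPs of one disease drawn at random, without replacement, from all disease-associated SNPs in the GWAS catalog) and a one-sided hypergeometric tail. The function name, argument names and toy counts are illustrative assumptions, not taken from the study.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_catalog, n_catalog_in_border,
                                n_disease, n_disease_in_border):
    """Return P(X >= n_disease_in_border) under a hypergeometric null.

    n_catalog            -- all disease-associated SNPs in the catalog
    n_catalog_in_border  -- how many of those lie in a TAD border
    n_disease            -- SNPs associated with the disease under test
    n_disease_in_border  -- how many of the disease SNPs lie in a TAD border
    """
    total = comb(n_catalog, n_disease)
    upper = min(n_disease, n_catalog_in_border)
    n_outside = n_catalog - n_catalog_in_border
    # Sum the probability of every outcome at least as extreme as observed.
    tail = sum(
        comb(n_catalog_in_border, k) * comb(n_outside, n_disease - k)
        for k in range(n_disease_in_border, upper + 1)
    )
    return tail / total


# Toy counts (hypothetical): 10 catalog SNPs, 3 in borders,
# a disease with 4 SNPs of which 2 fall in borders.
p_value = hypergeom_enrichment_pvalue(10, 3, 4, 2)
print(p_value)
```

The returned value is an uncorrected p-value for a single disease; the correction for multiple testing mentioned in the figure captions is not part of this sketch.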

Conclusions: Our results suggest that, for certain diseases, part of the genetic risk lies in a local genetic variation affecting the genome partitioning in topologically insulated domains. Investigating this possible contribution to genetic risk is particularly relevant in cancers. This study thus opens a way of interpreting genome-wide association studies, by distinguishing two types of disease-associated SNPs: one with an effect on an individual gene, the other acting in interplay with 3D genome organization.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Data structure behind the investigation. Underlying Hi-C data are displayed as a heat map (the redder, the more contacts, as indicated in the color bar), here for a region of chromosome 11 (chr11: 123,050,000–123,550,000, hg19 coordinates), drawn from data published in [30] for the IMR90 cell type at 10 kb resolution. TADs are outlined with black triangles. They are determined using the TopDom algorithm [31], which identifies a demarcation between two TADs as a local minimum of the number of contacts in a sliding window of half-size k bins, where k is a tunable parameter (blue diamonds; the solid one corresponds to the limit of a TAD). TAD borders are defined as 20 kb regions extending inward from each TAD end, and are shown here as small triangles filled in gray. Vertical lines pinpoint disease-associated SNPs located in TAD borders (solid lines for cancer-associated SNPs, dashed lines for SNPs associated with non-cancer diseases). For each cancer and each non-cancer disease, we investigate the potential over-representation of its associated SNPs in TAD borders. The dashed white arrow indicates the increased physical contacts and regulatory interactions between adjacent TADs that could appear when a border is affected by the presence of an at-risk SNP allele.
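The two definitions in the Fig. 1 caption can be illustrated with a simplified sketch. `boundary_bins` only reproduces the local-minimum idea on a toy contact matrix and is not the TopDom implementation; `tad_borders` applies the 20 kb inward border definition. All function names, the half-open interval convention and the toy data are assumptions made for illustration.

```python
BORDER_SIZE = 20_000  # 20 kb, as stated in the caption


def boundary_bins(contacts, k):
    """Return bin indices i where the mean contact count between the k bins
    upstream of i and the k bins downstream (starting at i) is a strict
    local minimum. Requires k >= 1. Simplified sketch, not TopDom itself."""
    n = len(contacts)
    signal = {}
    for i in range(1, n):  # candidate boundary between bin i-1 and bin i
        upstream = range(max(0, i - k), i)
        downstream = range(i, min(n, i + k))
        values = [contacts[a][b] for a in upstream for b in downstream]
        signal[i] = sum(values) / len(values)
    return [i for i in range(2, n - 1)
            if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]]


def tad_borders(tads, border_size=BORDER_SIZE):
    """For each TAD (start, end), return the two border intervals that
    extend border_size base pairs inward from each end (half-open)."""
    borders = []
    for start, end in tads:
        borders.append((start, min(start + border_size, end)))
        borders.append((max(end - border_size, start), end))
    return borders


def in_border(position, borders):
    """True if a SNP position falls inside any border interval."""
    return any(lo <= position < hi for lo, hi in borders)


# Toy contact matrix (hypothetical): two blocks of three bins each,
# many contacts within a block (10) and few between blocks (1).
contacts = [[10 if (a < 3) == (b < 3) else 1 for b in range(6)]
            for a in range(6)]
print(boundary_bins(contacts, k=2))

# Toy TAD coordinates in base pairs (hypothetical).
borders = tad_borders([(0, 300_000), (300_000, 500_000)])
print(in_border(295_000, borders), in_border(150_000, borders))
```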
Fig. 2
Preferential location in TAD borders of the SNPs associated with a disease. A Normalized histogram of enrichment p-values (corrected for multiple testing and plotted as [− log10 p] on the horizontal axis) for cancers (yellow bars) and non-cancer diseases (blue bars, overlap in gray), considering the contribution of all SNPs to a potential TAD-border enrichment. The results have been aggregated over datasets for different cell types from [30] at 10 kb resolution and over all considered values of the window parameter k of the TopDom algorithm. The two histograms have been normalized separately. The dashed red line indicates the significance threshold at p* = 0.05. B Same as (A), considering only intergenic disease-associated SNPs (daSNPs) in the hypergeometric enrichment test. C Difference between the cancer histogram, in orange, and the histogram for non-cancer diseases in (A), showing that relatively more cancers display a preferential location of their associated SNPs in TAD borders. D Same as (C) for the histograms in (B), showing that the relative dominance of cancers, among diseases displaying TAD-border enrichment, is enhanced when considering only intergenic daSNPs.
Fig. 3
Preferential location of daSNPs in TAD borders occurs relatively more often for cancers. A comparison of the fraction of cancers (orange) and the fraction of non-cancer diseases (blue) displaying TAD-border enrichment is presented for various filters on the GWAS catalog: considering all daSNPs, then only exonic, intronic, or intergenic daSNPs. Only diseases displaying enrichment for a majority of values of the TopDom window-size parameter k are included (see Methods). A Aggregation over five cell types (six datasets) using data at 10 kb resolution from [30]. B–G Detailed comparison for each of the six datasets, where the cell type is indicated above each panel, together with the restriction enzyme (DpnII or MboI) used in the Hi-C experiment. Stars indicate when the difference between cancers and non-cancer diseases is statistically significant (Fisher exact test, *p < 0.05, **p < 0.01, ****p < 0.0001). Analyses for data from [, –38] are presented in Additional file 1: Fig. S7.
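A hedged sketch of the two steps mentioned in the Fig. 3 caption: keeping a disease only if it is enriched for a majority of k values, and comparing cancers with non-cancer diseases using a Fisher exact test on a 2x2 table. Reading "majority" as strictly more than half, applying a 0.05 threshold per value of k, using the two-sided form of the test, and all names and counts below are assumptions rather than details confirmed by the source.

```python
from math import comb


def enriched_for_majority(pvalues_by_k, alpha=0.05):
    """True if the enrichment p-value is below alpha for strictly more
    than half of the tested window sizes k (assumed reading of 'majority')."""
    n_significant = sum(1 for p in pvalues_by_k.values() if p < alpha)
    return n_significant > len(pvalues_by_k) / 2


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical p-values for one disease, keyed by window size k.
print(enriched_for_majority({5: 0.01, 10: 0.03, 20: 0.2}))

# Hypothetical counts: rows are cancers / non-cancer diseases,
# columns are enriched / not enriched.
print(fisher_exact_two_sided(3, 1, 1, 3))
```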
Fig. 4
Comparison of enrichment results for IMR90 cells and embryonic cells. Same as Fig. 3, now comparing three datasets for the IMR90 cell line (left column, panels C, D and E) with two datasets for embryonic stem cells (hESC, panels F and G). The two panels at the top display the results aggregated over the individual datasets, for IMR90 cells (pink background, panel A) and embryonic stem cells (blue background, panel B), respectively.
Fig. 5
Clustering of cancers and non-cancer diseases sharing daSNPs. In this network representation, red (resp. violet) nodes correspond to cancers (resp. non-cancer diseases) displaying TAD-border enrichment in daSNPs, and orange (resp. light blue) nodes to other cancers (resp. other non-cancer diseases). An edge is drawn between two diseases when they share an associated SNP that is A located in a TAD border for a majority of values of the window parameter k (border SNPs, see Methods) or B part of the complementary set (non-border SNPs). C, D Same as (A, B) when considering only intergenic SNPs. The four networks have been visualized using the NetworkX Python package. Underlying Hi-C data from [30], IMR90 cell type, 10 kb resolution.
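The edge rule in the Fig. 5 caption can be sketched in plain Python. The authors report using NetworkX for visualization; this dependency-free version only builds the edge list, and the function name, disease labels and SNP identifiers are hypothetical.

```python
from itertools import combinations


def shared_snp_edges(disease_to_snps, allowed_snps=None):
    """Return {(disease_1, disease_2): shared SNPs} for every pair of
    diseases sharing at least one associated SNP.

    If allowed_snps is given, only SNPs in that set can create an edge
    (e.g. border SNPs for one network, non-border SNPs for the other)."""
    edges = {}
    for d1, d2 in combinations(sorted(disease_to_snps), 2):
        shared = disease_to_snps[d1] & disease_to_snps[d2]
        if allowed_snps is not None:
            shared &= allowed_snps
        if shared:
            edges[(d1, d2)] = shared
    return edges


# Hypothetical toy data.
disease_to_snps = {
    "cancer_A": {"s1", "s2"},
    "cancer_B": {"s2", "s3"},
    "disease_C": {"s3", "s4"},
}
border_snps = {"s2"}
all_snps = set().union(*disease_to_snps.values())

border_network = shared_snp_edges(disease_to_snps, border_snps)
non_border_network = shared_snp_edges(disease_to_snps, all_snps - border_snps)
print(border_network)
print(non_border_network)
```

Building the border and non-border networks from the same mapping, with complementary SNP filters, mirrors the split between panels A and B.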

References

    1. Hardy J, Singleton A. Genome wide association studies and human disease. New Engl J Med. 2009;360:1759–1768.
    2. Visscher PM, Wray NR, Zhang Q, Sklar P, McCarthy MI, Brown MA, et al. 10 years of GWAS discovery: biology, function, and translation. Am J Hum Genet. 2017;101:5–22.
    3. Erichsen HC, Chanock SJ. SNPs in cancer research and treatment. Brit J Cancer. 2004;90:747–751.
    4. McKay JD, Hung R, Han Y, Zong X, Carreras-Torres R, Christiani D, et al. Large-scale association analysis identifies new lung cancer susceptibility loci and heterogeneity in genetic susceptibility across histological subtypes. Nat Genet. 2017;49:1126–1132.
    5. Maurano MT, Humbert R, Thurman RE, Haugen E, Wang H, Reynolds AP, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012;337:1190–1195.