Nat Commun. 2018 Mar 12;9(1):1028.
doi: 10.1038/s41467-018-03411-9.

Capture Hi-C identifies putative target genes at 33 breast cancer risk loci

Joseph S Baxter et al. Nat Commun. 2018.

Abstract

Genome-wide association studies (GWAS) have identified approximately 100 breast cancer risk loci. Translating these findings into a greater understanding of the mechanisms that influence disease risk requires identification of the genes or non-coding RNAs that mediate these associations. Here, we use Capture Hi-C (CHi-C) to annotate 63 loci; we identify 110 putative target genes at 33 loci. To assess the support for these target genes in other data sources, we test for associations of expression levels with SNP genotype (eQTLs) and with disease-specific survival (DSS), and compare the target genes with somatically mutated cancer genes. Twenty-two putative target genes are eQTLs, 32 are associated with DSS and 14 are somatically mutated in breast or other cancers. Identifying the target genes at GWAS risk loci will lead to a greater understanding of the mechanisms that influence breast cancer risk and prognosis.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Cell-type-specificity of interaction peaks at 51 informative breast cancer risk loci. Bar charts showing (a) the number of loci at which there were 0, 1–9, 10–19, 20–49, 50–99, 100–249 or >250 interaction peaks and (b) the number of interaction peaks at which the distance between interacting fragments is <100, 100–199, 200–499, 500–999, 1,000–1,999, 2,000–3,999 or ≥4,000 kb for each cell line analysed. c, d Venn diagrams illustrating the overlap between interacting fragments in (c) the four breast cancer cell lines and (d) the five breast cell lines. e Dendrogram showing Jaccard dissimilarity scores (i.e. 1 − similarity coefficient) for the four breast cancer libraries
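The Jaccard dissimilarity score named in panel e can be illustrated with a minimal Python sketch. It assumes each library is reduced to a set of interacting-fragment identifiers; the function name, library names and fragment IDs below are hypothetical and are not taken from the paper or its data.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of fragment IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets are identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Made-up fragment IDs for three hypothetical libraries (illustration only).
libraries = {
    "library_A": {"frag1", "frag2", "frag3"},
    "library_B": {"frag2", "frag3", "frag4"},
    "library_C": {"frag5"},
}

# Pairwise dissimilarities, the kind of matrix a dendrogram is built from.
pairwise = {
    (x, y): jaccard_dissimilarity(libraries[x], libraries[y])
    for x, y in combinations(sorted(libraries), 2)
}
```

A score of 0 means two libraries share all interacting fragments, and a score of 1 means they share none. How the authors clustered these scores into the dendrogram is not stated in the caption.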
Fig. 2
Interaction peaks at 10q26.13 and 14q13.3 in T-47D, Bre80 and MDA-MB-231 cell lines. Interaction peaks are shown in a looping format; interaction peaks in ZR-75-1, BT-20 and GM06990 are not shown but are available online (Methods). Interaction peaks between two captured fragments are red; interaction peaks between one captured fragment and one non-captured fragment are blue. The intensity of individual interactions is proportional to −log2(P_FDR). Capture regions are shown as black bars; data are aligned with genomic coordinates (hg19) and RefSeq genes. Target genes (i.e. the subset at which an interaction peak co-localises with the TSS) are shown in red. The location of the published risk SNP is also shown. a 10q26.13-rs2981579 (FGFR2) locus. Interaction peaks originating from the capture region and co-localising with the FGFR2 TSS interact with a region ~650 kb centromeric to the locus (highlighted in yellow) in T-47D and Bre80, but not MDA-MB-231. b Interaction peaks (shown in blue) at this region co-localise with DNase I hypersensitive sites, CTCF, p300, FOXA1, GATA3 and ERα ChIP-Seq peaks in T-47D cells. The orientation of CTCF peaks is indicated by the direction of the arrow. c 14q13.3-rs2236007 (PAX9) locus. Interaction peaks originating from the capture region and co-localising with the PAX9 TSS interact with two regions ~300 and 500 kb telomeric to the locus (highlighted in yellow) in T-47D and Bre80, but not MDA-MB-231. Scale bar, 80 kb. d, e Interaction peaks at these regions co-localise with DNase I hypersensitive sites, CTCF, FOXA1 and GATA3 ChIP-Seq peaks in T-47D cells. The orientation of CTCF peaks is indicated by the direction of the arrow
Fig. 3
Interaction peaks at 11p15.5 and 6q25.1 in T-47D, Bre80 and MDA-MB-231 cell lines. Interaction peaks and genomic features are as described in Fig. 2. a 11p15.5-rs3817198 (LSP1) locus. At this locus, which is associated with ER+ disease, there are multiple interaction peaks targeting KRTAP5-5 (~300 kb centromeric), LSP1 (within the capture region) and IGF2 (~250 kb telomeric) in the ER− cell line MDA-MB-231, but just a single interaction peak targeting KRTAP5-5 in the ER+ cell line T-47D and none in Bre80. b 6q25.1-rs2046210 (ESR1) locus. At this locus, which is associated with predominantly ER− disease, there are multiple interaction peaks originating from the capture region and overlapping the ESR1 promoter in the ER+ breast cancer cell line T-47D, but not in the ER− breast cancer cell line MDA-MB-231
Fig. 4
Interaction peaks at 11q13.1 and 11q13.3 in T-47D, Bre80 and MDA-MB-231 cell lines. Interaction peaks and genomic features are as described in Figs 2 and 3, with the exception that only target genes that are eQTLs are in red. In addition to the local within-capture and in cis interaction peaks at each of these loci, there are long-range (>4 Mb) interaction peaks between the two risk loci in both Bre80 and MDA-MB-231 cell lines. Target genes that are eQTLs for the 11q13.1 risk SNP (rs3903072) are CFL1, CTSW, KAT5, MUS81, SNX32, CCND1 and FADD. Three interactions are omitted for clarity (see Methods)
Fig. 5
Box plots of gene expression according to genotype and Kaplan–Meier plots of disease-specific survival according to levels of expression for FADD (11q13), CDCA7 (2q31.1), ZFP36L1 (14q24.1) and MRPL34 (19p13.1). a Levels of expression of FADD are associated with 11q13.1-rs3903072 genotype in all cancers (P = 0.04) and ER+ cancers (P = 0.01). b In ER+ cancers, levels of expression of FADD are also associated with disease-specific survival (DSS). c, d Excluding samples with copy-number gains (c) strengthened the eQTL association in ER+ cancers (P = 0.004) but (d) attenuated the association with DSS. e, g, i Levels of expression of CDCA7, ZFP36L1 and MRPL34 are associated with 2q31.1-rs1550623 genotype in all cancers (P = 0.007), 14q24.1-rs2588809 genotype in ER+ cancers (P = 0.004) and 19p13.1-rs8170 genotype in all cancers (P = 0.001) and ER+ cancers (P = 0.01), respectively. f, h, j In ER+ cancers, levels of expression of all three genes are associated with DSS
