Am J Hum Genet. 2024 Dec 5;111(12):2773-2788.
doi: 10.1016/j.ajhg.2024.10.005. Epub 2024 Nov 3.

Genomes and epigenomes of matched normal and tumor breast tissue reveal diverse evolutionary trajectories and tumor-host interactions

Bin Zhu et al. Am J Hum Genet. 2024.

Abstract

Normal tissues adjacent to the tumor (NATs) may harbor early breast carcinogenesis events driven by field cancerization. Although previous studies have characterized copy-number (CN) and transcriptomic alterations, the evolutionary history of NATs in breast cancer (BC) remains poorly characterized. Utilizing whole-genome sequencing (WGS), methylation profiling, and RNA sequencing (RNA-seq), we analyzed paired germline, NAT, and tumor samples from 43 individuals with BC in Hong Kong (HK). We found that single-nucleotide variants (SNVs) were common in NATs, with one-third of NAT samples exhibiting SNVs in driver genes, many of which were present in paired tumor samples. The most frequently mutated genes in both tumor and NAT samples were PIK3CA, TP53, GATA3, and AKT1. In contrast, large-scale aberrations such as somatic CN alterations (SCNAs) and structural variants (SVs) were rarely detected in NAT samples. We generated phylogenetic trees to investigate the evolutionary history of paired NAT and tumor samples. The pairs could be categorized into tumor-only, shared-tree, and multiple-tree groups, the last of which is concordant with non-genetic field cancerization. These groups exhibited distinct genomic and epigenomic characteristics in both NAT and tumor samples. Specifically, NAT samples in the shared-tree group showed a higher number of mutations, while NAT samples belonging to the multiple-tree group showed a less inflammatory tumor microenvironment (TME), characterized by a higher proportion of regulatory T cells (Tregs) and a lower presence of CD14+ cell populations. In summary, our findings highlight the diverse evolutionary history in BC NAT/tumor pairs and the impact of field cancerization and TME in shaping the genomic evolutionary history of tumors.

Keywords: Chinese; breast cancer; cancer genomics; clonal evolution; normal tissues adjacent to the tumor; omics analyses; whole-genome sequencing.

Conflict of interest statement

Declaration of interests: The authors have no competing interests to disclose.

Figures

Graphical abstract
Figure 1
Genomic landscape of breast tumor and paired normal adjacent to tumor samples from 43 individuals with breast cancer in the Hong Kong Breast Cancer Study. (A) Study design overview. This figure illustrates the study’s workflow, where paired germline, NAT, and tumor samples were collected for comprehensive multi-omics profiling. This included whole-genome sequencing (WGS), RNA sequencing (RNA-seq), and methylation profiling. Additionally, tissue scans were reviewed by pathologists to validate the classification of tumor and normal tissues. (B) Number of samples included for different omics platforms. (C) Stacked bar plot displaying the frequencies of mutated breast cancer driver genes in tumor and NAT samples among 43 individuals, with colors representing the functional impact of the mutations. (D) The oncoplot visualizes the prevalence of point mutations within breast cancer driver genes across tumor and NAT samples. The top panel details mutation frequencies across the samples. The bottom bars categorize the samples according to PAM50 tumor subtypes and phylogenetic tree groups, respectively. Colored blocks overlaid on the plot denote cases where a sample exhibits two different mutations within the same gene.
Figure 2
Genomic and epigenomic comparison between paired breast tumor and normal adjacent to tumor samples in 43 Hong Kong individuals with breast cancer. (A) Tumor mutational burden (TMB), illustrated as the number of mutations per megabase (Mb) on a logarithmic scale. TMB is calculated as the ratio of the number of synonymous and nonsynonymous single-nucleotide variants detected by whole-genome sequencing (WGS) to the size of the whole genome. (B) Telomere length (TL) estimated from the WGS data and measured in kilobases (kb). (C) Proportions of COSMIC single-base substitution (SBS) signatures in tumor and NAT samples. (D) Proportions of COSMIC double-base substitution (DBS) signatures in tumor samples. (E) Proportions of COSMIC insertion-deletion (ID) signatures in tumor samples. (F) Proportions of relative scores of immune cells estimated by MethylCIBERSORT. Significance levels are denoted as follows: ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05.
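The TMB definition in the Figure 2A caption can be illustrated with a minimal Python sketch. The function name, the example counts, and the genome size of 3,000 Mb are assumptions made for illustration only; the caption does not state the genome size the authors used.

```python
import math

# Assumed genome size in megabases; a hypothetical placeholder, since the
# caption only says "the size of the whole genome".
ASSUMED_GENOME_SIZE_MB = 3000.0


def tumor_mutational_burden(n_synonymous, n_nonsynonymous,
                            genome_size_mb=ASSUMED_GENOME_SIZE_MB):
    """Return mutations per megabase: (synonymous + nonsynonymous SNVs) / genome size."""
    if genome_size_mb <= 0:
        raise ValueError("genome_size_mb must be positive")
    return (n_synonymous + n_nonsynonymous) / genome_size_mb


# Hypothetical counts for one sample (not taken from the study).
tmb = tumor_mutational_burden(1500, 4500)
# The figure plots TMB on a logarithmic scale, so a log10 transform is shown too.
log_tmb = math.log10(tmb)
print(tmb, log_tmb)
```

With these made-up counts the sample carries 6,000 SNVs over an assumed 3,000 Mb, so the sketch reports 2 mutations per Mb.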
Figure 3
Analysis of somatic copy-number alterations and structural variants in paired breast tumor and normal adjacent to tumor samples in 43 Hong Kong individuals with breast cancer. (A) The somatic copy-number alteration (SCNA) profiles for both tumor and NAT samples along chromosomes. The relative copy-number profile is determined by subtracting the ploidy (nWGD = 2, WGD = 4) from the total copy number. Annotations along the rows include tumor purity, WGD status, PAM50 classification, and sample type, detailed on the right side. The top panel highlights SCNA frequencies, categorizing them into amplifications, deletions, and copy-neutral loss of heterozygosity (LOH), represented by a black line. WGD indicates samples with whole-genome doubling, while nWGD denotes those without. (B) The frequency of SCNA signatures in tumor samples, with corresponding frequencies in NAT samples detailed in Figure S6. (C) Number and type of SV events observed in NAT and tumor samples, respectively. (D) Circos plot of SV events in NAT samples. (E) Frequency and type of SV signatures in tumor samples.
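The relative copy-number rule in the Figure 3A caption (total copy number minus a baseline ploidy of 2 for nWGD samples or 4 for WGD samples) can be sketched as follows. The function name and the per-segment list representation are hypothetical choices for illustration, not the authors' pipeline.

```python
def relative_copy_number(total_copy_numbers, has_wgd):
    """Subtract the baseline ploidy (4 if whole-genome doubled, else 2)
    from each segment's total copy number."""
    baseline = 4 if has_wgd else 2
    return [cn - baseline for cn in total_copy_numbers]


# Hypothetical segment-level total copy numbers (not from the study).
print(relative_copy_number([2, 3, 1, 4], has_wgd=False))
print(relative_copy_number([4, 6, 2], has_wgd=True))
```

Under this rule, positive values indicate gains relative to the baseline and negative values indicate losses, which puts WGD and nWGD samples on a comparable scale.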
Figure 4
Representative phylogenetic structures identified in NAT-tumor pair samples. Different colored lines are used to signify the presence of clones or subclones in paired NAT and tumor samples. The color scheme includes green for shared clones, orange for NAT-exclusive clones, and blue for tumor-exclusive clone or subclone membership. Circular plots below each tree visually represent the clonal and subclonal structures, along with the cancer cell fraction (CCF) values within each cluster. The first row within each oval plot provides an overview of the clone and subclone clusters’ overall structure. Subclonal clusters superimposed on one another indicate clonality membership, with the larger clusters representing subclones enveloping their nested counterparts. Subclones that are adjacent but not superimposed signify divergent subclones. The size of the clone or subclone ovals reflects the CCF value for each corresponding cluster.
Figure 5
Association of genomic and epigenomic characteristics with broad phylogenetic tree groups in NAT samples. (A) Tumor mutation burden. (B) Telomere length (TL), measured in kilobases (kb). (C) Frequency of CD14+ cell fractions estimated by MethylCIBERSORT. (D) Promoter methylation of probes in REC8. (E) REC8 expression levels in NAT samples. Significance levels are denoted as follows: ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05.
Figure 6
Association of genomic and epigenomic characteristics with broad phylogenetic tree groups in tumor samples. (A) Boxplots of tumor mutation burden (TMB), percent genome affected by somatic copy-number alterations (%PGA), structural variant counts, and the stacked bar plot of PAM50 tumor subtypes. Significance levels are denoted as follows: ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05. (B) Heatmap of driver genes visually representing mutated genes identified within samples in each tree group. The numbers within each cell denote the frequency of a specific gene mutation in each tree group, while the cell colors signify the proportion of that gene’s mutation relative to all the mutated genes identified within that tree group. (C) Bubble heatmaps highlight the presence of signatures, including single-base substitution (SBS), doublet-base substitution (DBS), small insertion and deletion (ID), structural variation (SV), and copy-number (CN) signatures, in each group. The size of each bubble in the plots represents the proportion of samples exhibiting a particular signature, while the color signifies the log10 value of the mean activity + 1 of that signature across all samples within a tree group.
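The two quantities encoded in the Figure 6C bubble heatmaps can be sketched in Python. This is an illustration under stated assumptions: a sample is taken to "exhibit" a signature when its activity is greater than zero, and the color value is read as log10(mean activity + 1). Neither reading is confirmed by the caption, and the function name and input values are hypothetical.

```python
import math


def bubble_stats(activities):
    """Return (proportion of samples with activity > 0, log10(mean activity + 1))
    for one signature within one tree group."""
    if not activities:
        raise ValueError("activities must not be empty")
    n_samples = len(activities)
    # Assumption: any non-zero activity counts as exhibiting the signature.
    proportion = sum(1 for a in activities if a > 0) / n_samples
    # Assumption: the "+ 1" is added to the mean before taking log10.
    color_value = math.log10(sum(activities) / n_samples + 1)
    return proportion, color_value


# Hypothetical per-sample signature activities (not from the study).
size, color = bubble_stats([0, 0, 18, 18])
print(size, color)
```

Adding 1 before the logarithm keeps the color value defined, and equal to zero, when a signature is absent from every sample in a group.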
Figure 7
Field cancerization in breast cancer. This figure illustrates the concept of field cancerization in breast cancer, showcasing the emergence of molecular alterations across various cell groups within breast tissue, which lead to multiple-tree or shared-tree groups. The normal epithelium (top) consists of normal cells (depicted in light pink) without detectable molecular changes. Due to mutational processes or changes in the cell microenvironment, morphologically normal cells with molecular alterations emerge, forming heterogeneous subclones (illustrated by dark brown and light brown cells in the middle left) or a homogeneous subclone (illustrated by light brown cells in the middle right). Some subclones (illustrated by light brown cells) continue to develop molecular alterations, leading to malignant cells (illustrated by purple cells). In the multiple-tree group (bottom left), subclones in NAT (e.g., dark brown cells) and tumors (e.g., light brown and purple cells) do not share genetic alterations. Conversely, in the shared-tree group (bottom right), subclones in NAT (e.g., light brown cells) and tumors (e.g., light brown and purple cells) share genetic alterations. The characteristics of NAT and tumor samples are listed for the multiple-tree and shared-tree groups, respectively.
