Genome Biol. 2025 Jul 17;26(1):212.
doi: 10.1186/s13059-025-03650-2.

Direct genetic transformation bypasses tumor-associated DNA methylation alterations

Sara Hetzel et al. Genome Biol. 2025.

Abstract

Background: Tumors represent dynamically evolving populations of mutant cells, and many advances have been made in understanding the biology of their progression. However, there are key unresolved questions about the conditions that support a cell's initial transformation, which cannot be easily captured in patient populations and are instead modeled using transgenic cellular or animal systems.

Results: Here, we use extensive patient atlas data to define common features of the tumor DNA methylation landscape as they compare to healthy human cells and apply this benchmark to evaluate 21 engineered human and mouse models for their ability to reproduce these patterns. Notably, we find that genetically induced cellular transformation rarely recapitulates the widespread de novo methylation of Polycomb regulated promoter sequences as found in clinical samples, but can trigger global changes in DNA methylation levels that are consistent with extensive proliferation in vitro.

Conclusions: Our results raise pertinent questions about the relationship between genetic and epigenetic aspects of tumorigenesis as well as provide an important molecular reference for evaluating existing and emerging tumor models.

Keywords: Cancer; DNA methylation; Disease models; Epigenetics; Genetically engineered mouse models.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Patients of the metastatic melanoma study were recruited in the Precision Oncology Program of the Charité Comprehensive Cancer Center (Berlin). Informed consent was obtained from all human subjects included in the study. The study was approved by the local Institutional Review Board of the Charité Universitätsmedizin Berlin (EA4/063/13, Charité Ethics Committee: Charitéplatz 1, 10117 Berlin, Germany). For other tumor types, genomic DNA was obtained from OriGene, where samples were collected under IRB-approved protocols, or publicly available data sets were used. Animal procedures have been approved by the IACUC of the respective organization (Massachusetts Institute of Technology, Harvard University, Beth Israel Deaconess Medical Center and Yale University). Consent for publication: Not applicable. Competing interests: A.M. and Z.D.S. are inventors on a patent related to hypermethylated CGI targets in cancer. Z.D.S. and A.M. are co-founders and scientific advisors of Harbinger Health. R.W. is currently an employee of Merck KGaA. A.R. is currently an employee of Genentech. M.Y. is CSO and shareholder of Alacris Theranostics GmbH. E.H. consults for and is a shareholder of Dyno Therapeutics. The other authors declare no competing interests.

Figures

Fig. 1
Identification of pan-cancer CGI hypermethylation signatures. a Schematic of the general DNA methylation trends that distinguish primary tumors from somatic cells, including global loss and focal gain at CpG islands (CGIs). The extent to which different genetically engineered experimental models recapitulate this genome-scale transformation has not been systematically investigated and is the subject of this study. For this purpose, a benchmark of DNA methylation dynamics within and across tumor and healthy cell types is required in order to assess the robustness of the detection of tumor-specific signatures. Additional schematics are included to summarize the data sets collected or generated to relate clinical signatures to experimental models. b Number of hyper CGIs commonly found (≥ 50% of patients) for each of the 26 tumor types. Tumor types examined (see labels in Fig. 1f): T cell acute lymphoblastic leukemia (T-ALL), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), B cell acute lymphoblastic leukemia (B-ALL), esophageal carcinoma (ESCA), cholangiocarcinoma (CHOL), head and neck squamous cell carcinoma (HNSC), colon adenocarcinoma (COAD), prostate adenocarcinoma (PRAD), breast invasive carcinoma (BRCA), pancreatic adenocarcinoma (PAAD), glioblastoma (GBM), lung adenocarcinoma (LUAD), rectum adenocarcinoma (READ), bladder urothelial carcinoma (BLCA), uterine corpus endometrial carcinoma (UCEC), lung squamous cell carcinoma (LUSC), acute myeloid leukemia (LAML), skin cutaneous melanoma (SKCM), liver hepatocellular carcinoma (LIHC), stomach adenocarcinoma (STAD), kidney renal clear cell carcinoma (KIRC), sarcoma (SARC), pheochromocytoma and paraganglioma (PCPG), kidney renal papillary cell carcinoma (KIRP), thymoma (THYM), thyroid carcinoma (THCA). Blue labels indicate tumor types that are less prone to CGI hypermethylation. 
c Histogram showing the fraction of common hyper CGIs that are shared across an increasing number of different tumor types. CGIs that are called in at least 30% of single tumor type comparisons were assigned to the pan-cancer hyper CGI set (marked in green). d Boxplot showing the number of hyper CGIs that are called per patient, with the number of total hyper CGIs found in at least one patient or included in the “common” set highlighted (upper and lower line, respectively). Six tumor types show comparatively low per-sample hyper CGI signal. For these, the number of common hyper CGIs called using our approach is below the core distribution of hyper CGIs for single samples (≤ 25% quantile), suggesting that these cancer types may be less prone to systematic CGI hypermethylation as part of their pathobiology. Lines denote the median, edges denote the interquartile range (IQR), whiskers denote 1.5 × IQR and minima/maxima are represented by dots. e Saturation analysis of the number of hyper CGIs cumulatively observed from random samples of 100 patient tumors (see Methods). A large fraction of each tumor type's overall CGI methylation profile is already captured by a comparatively small number of patient samples (< 25). f Boxplot showing the median methylation of each common tumor type-specific hyper CGI set for healthy and tumor samples. Note that each comparison uses its own set of commonly hypermethylated CGIs as called for the indicated tumor type. Lines denote the median, edges denote the interquartile range (IQR), whiskers denote 1.5 × IQR and minima/maxima are represented by dots. g Overlap between PRC2 target, pan-cancer hyper and the union of tumor type-specific hyper CGIs
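The two thresholds named in panels b and c (a CGI is "common" when called in ≥ 50% of patients of a tumor type, and "pan-cancer" when it appears in at least 30% of the tumor type comparisons) can be illustrated with a minimal Python sketch. It assumes that per-patient hyper CGI calls are already available as sets of identifiers; the function names, tumor type labels and CGI identifiers are hypothetical and not taken from the study's code.

```python
from collections import Counter


def common_hyper_cgis(calls_per_patient, min_patient_fraction=0.5):
    """Return CGIs called hypermethylated in at least the given fraction of patients."""
    n_patients = len(calls_per_patient)
    if n_patients == 0:
        return set()
    # Count each CGI once per patient.
    counts = Counter(cgi for calls in calls_per_patient for cgi in set(calls))
    return {cgi for cgi, c in counts.items() if c / n_patients >= min_patient_fraction}


def pan_cancer_hyper_cgis(common_by_tumor_type, min_type_fraction=0.3):
    """Return CGIs found in the common set of at least the given fraction of tumor types."""
    n_types = len(common_by_tumor_type)
    if n_types == 0:
        return set()
    counts = Counter(cgi for cgis in common_by_tumor_type.values() for cgi in cgis)
    return {cgi for cgi, c in counts.items() if c / n_types >= min_type_fraction}


# Toy input: hypothetical tumor types, each with a list of per-patient call sets.
patients_by_type = {
    "TYPE_A": [{"cgi1", "cgi2"}, {"cgi1"}, {"cgi1", "cgi3"}, {"cgi2"}],
    "TYPE_B": [{"cgi1"}, {"cgi1", "cgi4"}],
    "TYPE_C": [{"cgi5"}, {"cgi5"}],
    "TYPE_D": [{"cgi2", "cgi5"}, {"cgi2"}],
}
common = {t: common_hyper_cgis(p) for t, p in patients_by_type.items()}
pan_cancer = pan_cancer_hyper_cgis(common)
print(sorted(pan_cancer))
```

In this toy input, cgi4 is common in only one of four tumor types (25%) and therefore falls below the 30% cutoff for the pan-cancer set.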
Fig. 2
Cancer-associated CGI hypermethylation is not observed in most healthy cell types. a Boxplot displaying the fraction of each tumor type-specific hyper CGI set that is called as methylated across a panel of 46 purified human cell types. Each data point represents the overlap between the methylated CGIs within a given healthy cell type and each tumor type’s common hyper CGI set to highlight the degree to which healthy human cells carry cancer-like methylation patterns (denoted as boxplot or dots). Tumor types with overall low numbers of hyper CGIs are flagged as exceptions for reasons described in text. Additionally, three healthy cell types show a higher degree of cancer-associated CGI methylation, including memory B cells, colon and small intestinal epithelium. Memory B cells have been reported to show elevated CGI methylation levels [28, 29], while colon and small intestine epithelium (*) show higher levels but were obtained as part of tumor resections from cancer patients or from patients with other intestinal diseases, which confounds our ability to say that these cells definitively represent a true “healthy” sample [27]. Lines denote the median, edges denote the IQR and whiskers denote 1.5 × IQR. Eryth-prog = erythrocyte progenitor, Cardio = cardiomyocyte, Granul = granulocyte, Musc = muscle, Fibro = fibroblast, Mono = monocyte, Macro = macrophage, Ep = Epithelium, Alveo = alveolar, Osteob = osteoblast, NK = natural killer cell, Epid-Kerat = epidermal keratinocyte, Bron = bronchus, Oligodend = oligodendrocyte, Hep = hepatocyte, Small-Int = small intestine. b Comparison between tumor type-specific hyper CGIs and different control tissues. Left: Heatmap displaying the fraction of each tumor type’s original hyper CGI set (defined with the correct control) that are called when using any other healthy tissue as the control sample (median fraction recovered = 0.8). 
The diagonal is always 1, as the complete CGI set can be found when compared against the correct control tissue (x-axis = control tissue, y-axis = tumor). Right: Log2-transformed enrichment of hyper CGIs called compared to random sampling (0 reflects no difference to random sampling). Dots denote the median and the gray area denotes the IQR. c Genome browser track of the TAFA4 locus with exemplary whole genome bisulfite sequencing (WGBS) samples generated from healthy breast, lung and colon tissue as well as from primary tumors derived from these tissues. The tumor samples exhibit hypermethylation of the promoter CGI and loss of methylation in inter- and intragenic regions. d Scatterplot showing the median methylation and fraction of methylated CGIs for PRC2 target CGIs, as well as for the pan-cancer and tumor type-specific hyper CGI sets for healthy and tumor WGBS samples. e Summary of CGI methylation dynamics in healthy cells as they relate to tumors: Approximately one third of all CGIs are methylated in healthy tissues (1) and most of them overlap across sampled cell types (2). Tumor types methylate additional CGIs as part of their biology, which rarely overlap with signatures found in any healthy cell type (3). This finding is supported by the fact that a large fraction of tumor-associated hyper CGIs can be recapitulated when using different control tissues (4, left). The main exceptions to this are tumor types with overall low numbers of hyper CGIs (4, right). Ovals are not drawn in a quantitative manner, but to roughly reflect the size of CGI sets
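The two quantities in panel b can be sketched in Python as follows. The fraction recovered follows directly from the caption. The enrichment calculation is an assumption: it compares the observed overlap with the overlap expected if the alternative call set were drawn at random from all CGIs, which may differ from the sampling procedure described in the paper's Methods. All names and numbers are hypothetical.

```python
import math


def fraction_recovered(original, alternative):
    """Fraction of the original hyper CGI set that is also called with another control."""
    if not original:
        return 0.0
    return len(original & alternative) / len(original)


def log2_enrichment(original, alternative, n_total_cgis):
    """Log2 ratio of observed overlap to the overlap expected under random sampling.

    Assumption: the expected overlap is the hypergeometric mean,
    len(original) * len(alternative) / n_total_cgis.
    """
    observed = len(original & alternative)
    if observed == 0:
        return float("-inf")
    expected = len(original) * len(alternative) / n_total_cgis
    return math.log2(observed / expected)


# Toy example with hypothetical CGI identifiers and a hypothetical universe of 32 CGIs.
original_set = {"cgi1", "cgi2", "cgi3", "cgi4"}   # called against the matched control
alternative_set = {"cgi1", "cgi2", "cgi3"}        # called against another healthy tissue
recovered = fraction_recovered(original_set, alternative_set)
enrichment = log2_enrichment(original_set, alternative_set, n_total_cgis=32)
print(recovered, enrichment)
```

A value of 0 for the enrichment would correspond to an overlap no larger than expected by chance, matching the reading of the right-hand panel.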
Fig. 3
Absence of CGI hypermethylation in in vitro models of melanoma. a Schematic of the melanoma model introduced by Hodis et al. (Ref. [35]). Wild type (WT) melanocytes were sequentially edited to generate nine different cell lines to model genetic drivers of melanoma. Edits included the commonly mutated genes CDKN2A (C), BRAF (B), TERT (T), TP53 (3), PTEN (P) and APC (A). Cultured models marked in green were additionally used to generate xenograft models. b Top: Principal component analysis (PCA) of healthy, tumor and engineered melanocyte model samples based on pan-cancer hyper CGI methylation. Samples of the melanoma model group closely to healthy tissues but not the majority of tumor samples. Bottom: Boxplot showing the distribution of healthy, engineered melanocyte model and tumor (split by melanoma and other) samples on the first principal component. Lines denote the median, edges denote the IQR, whiskers denote 1.5 × IQR and minima/maxima are represented by dots. c Distribution of the mean methylation of 1 kilobase (kb) tiles and commonly hypermethylated CGIs (SKCM) across the different melanoma models and replicates. Lines denote the IQR and dots denote the median
Fig. 4
Melanoma patient signatures differ from experimentally transformed melanocytes. a Genome browser track of the OTX2 locus with exemplary WGBS samples of cultured wild type and mutated melanocytes, resulting xenografts as well as patient melanomas. While the tumor samples show different degrees of CGI hypermethylation, the melanocyte model samples maintain CGIs in an unmethylated state. b Barplot showing the number of hyper- and hypomethylated differentially methylated regions (DMRs) called between in vitro model, xenograft or patient and control samples. c Comparison of the number of CGIs that overlap hyper- or hypomethylated DMRs between control and model or patient samples, split according to distinct features (all, PRC2-regulated, pan-cancer, and defined for SKCM specifically). Only the patient samples display a detectable number of hypermethylated CGIs for any category. d Heatmap showing the DNA methylation landscape of chromosome 18q averaged across 100 kb tiles for wild type and knockout in vitro melanocytes, xenografts and melanoma patients. The annotation of highly and partially methylated domains (HMDs, PMDs) was defined according to the wild type melanocyte samples (see Methods). e Overlap between hyper- or hypomethylated DMRs of the three different experimental comparisons (in vitro model, xenografts or patients) against control. In order to compare the overlap, we merged DMR sets from all comparisons (separated by hyper and hypo) and calculated the overlap between individual sets. f Left: Heatmap and hierarchical clustering of the 5000 most variable CpGs across the melanoma cohort. Patient, wild type melanocytes and one mutated melanocyte sample (CB) cluster separately from the more hypomethylated model samples. Right: Distribution of most variable CpGs in CGI-related features and chromatin states (based on penis foreskin melanocytes). 
The majority of CpGs are found in quiescent, heterochromatic or Polycomb-repressed open sea regions and rarely overlap features with clear regulatory functions. g Same as in f but based on the 5000 most variable CpGs within CGIs. Here, patient samples cluster separately from all model samples (wild type, knockout, xenograft) and CpGs are frequently found in bivalent domains
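Selecting the most variable CpGs, as in panels f and g, can be sketched in Python as follows. The sketch assumes a complete matrix of methylation fractions (one list of per-sample values per CpG, no missing data) and ranks CpGs by population variance; the paper's exact variability measure and filtering are not stated in the caption, and the identifiers are hypothetical.

```python
from statistics import pvariance


def most_variable_sites(methylation, n_top):
    """Return the n_top site identifiers with the highest variance across samples.

    methylation: dict mapping a site identifier to a list of methylation
    fractions (0 to 1), one value per sample.
    """
    ranked = sorted(methylation, key=lambda site: pvariance(methylation[site]), reverse=True)
    return ranked[:n_top]


# Toy matrix: four hypothetical CpGs measured in three samples.
toy_methylation = {
    "cpg_a": [0.1, 0.9, 0.5],
    "cpg_b": [0.5, 0.5, 0.5],
    "cpg_c": [0.0, 1.0, 1.0],
    "cpg_d": [0.4, 0.6, 0.5],
}
top_sites = most_variable_sites(toy_methylation, n_top=2)
print(top_sites)
```

For panel g, the same selection would be applied after restricting the input to CpGs that fall within CGIs.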
Fig. 5
BJ fibroblast transformation models do not induce widespread CGI hypermethylation. a Schematic of the senescence and transformation model utilized by Xie et al. (Ref. [39]). BJ fibroblasts were transduced either sequentially with human telomerase catalytic subunit (TERT), the simian virus 40 large T antigen (SV40), and the H-Ras oncoprotein (HRAS, with potential subsequent implantation into a xenograft model) or with an empty vector (EV) that over time served as a model for senescence (Sen). Additionally, oncogene-induced senescence (OIS) was achieved by transfecting the cells directly with the H-Ras oncoprotein. b Boxplot showing the median methylation across pan-cancer hyper CGIs for samples of the BJ senescence and transformation model as well as for individual tumor types (450k array, split by types prone to CGI hypermethylation and exceptions from Fig. 2). Neither transformation nor senescence models show notable CGI hypermethylation, while transformed cells that were implanted into immunocompromised mice exhibit a mild gain. Lines denote the median, edges denote the IQR, whiskers denote 1.5 × IQR and minima/maxima are represented by dots
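The boxplot convention used throughout these captions (median line, IQR box, whiskers at 1.5 × IQR, extreme values as dots) can be made concrete with a minimal Python sketch. It assumes the common Tukey reading, in which whiskers extend to the most extreme value within 1.5 × IQR of the box, and uses the "inclusive" quartile method of the standard library; the plotting software used for the figures may compute quartiles slightly differently. The input values are hypothetical per-sample medians, not data from the study.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Compute median, quartiles, Tukey whiskers and outlying points for a list of numbers."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": med,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),    # most extreme value still within the lower fence
        "whisker_high": max(inside),   # most extreme value still within the upper fence
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Hypothetical median methylation values across a CGI set, one per sample.
sample_medians = [0.02, 0.03, 0.04, 0.05, 0.06, 0.40]
summary = boxplot_summary(sample_medians)
print(summary)
```

In this toy input, the single sample at 0.40 lies beyond the upper fence and would be drawn as a dot rather than stretching the whisker.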
Fig. 6
CGI hypermethylation is rarely observed in experimental mouse models. a Heatmaps showing the median methylation of PRC2 target, pan-cancer and tumor type-specific hyper CGIs in human patients (top) and a broad selection of genetic, chemical or spontaneous mouse tumorigenesis models (bottom). PRC2 targets were defined separately based on human and mouse annotations; hyper CGI sets were defined by identifying similar CGIs between both species that overlapped the human hyper CGI sets (see Methods). While the patient samples show varying degrees of CGI hypermethylation, only T-ALL mouse models exhibit any notable gain of methylation at these loci. Human patients were profiled using WGBS, while publicly available and newly generated mouse models were profiled using different assays (WGBS: AOM/DSS; PRC2 target enrichment: KCO(het) and KCO; enhanced reduced representation bisulfite sequencing (ERRBS): PyMT, Tet2fl/fl/FLT3-ITD and Idh1(R132H)KI; RRBS: all others). Precursor refers to the following: Hyperplasia for PyMT; DSS treatment (colitis) for AOM/DSS; myelodysplastic syndrome or another pre-leukemic state for LAML; Pten knockout cells or cells overexpressing Lmo2 prior to actual disease state for T-ALL. b Boxplot showing the median methylation across tumor type-specific hyper CGIs for mouse models (black box) and human patients (gray box). Lines denote the median, edges denote the IQR, whiskers denote 1.5 × IQR and minima/maxima are represented by dots. c Scatterplot showing the median methylation and fraction of methylated CGIs of tumor type-specific hyper CGIs for healthy, precursor and tumor mouse model samples. T-ALL samples are marked with a cross. For comparison to the behavior of human tumors, see Fig. 2d. d Comparison of the fraction of CpGs methylated at different levels across healthy and tumor samples for mouse models with available healthy tissue. 
Fractions should only be compared between samples of the same model as deviations due to sequencing technology and sample preparation between different studies are expected

References

    1. Madakashira BP, Sadler KC. DNA Methylation, Nuclear Organization, and Cancer. Front Genet. 2017;8:76. 10.3389/fgene.2017.00076.
    2. Jones PA, Baylin SB. The epigenomics of cancer. Cell. 2007;128:683–92. 10.1016/j.cell.2007.01.029.
    3. Feinberg AP. Phenotypic plasticity and the epigenetics of human disease. Nature. 2007;447:433–40. 10.1038/nature05919.
    4. Esteller M. Epigenetic gene silencing in cancer: the DNA hypermethylome. Hum Mol Genet. 2007;16 Spec No 1:R50–59. 10.1093/hmg/ddm018.
    5. Widschwendter M, et al. Epigenetic stem cell signature in cancer. Nat Genet. 2007;39:157–8. 10.1038/ng1941.
