Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2024 Jun 19:12:RP88895.
doi: 10.7554/eLife.88895.

Transcriptional immune suppression and up-regulation of double-stranded DNA damage and repair repertoires in ecDNA-containing tumors

Affiliations

Transcriptional immune suppression and up-regulation of double-stranded DNA damage and repair repertoires in ecDNA-containing tumors

Miin S Lin et al. Elife. .

Abstract

Extrachromosomal DNA is a common cause of oncogene amplification in cancer. The non-chromosomal inheritance of ecDNA enables tumors to rapidly evolve, contributing to treatment resistance and poor outcome for patients. The transcriptional context in which ecDNAs arise and progress, including chromosomally-driven transcription, is incompletely understood. We examined gene expression patterns of 870 tumors of varied histological types, to identify transcriptional correlates of ecDNA. Here, we show that ecDNA-containing tumors impact four major biological processes. Specifically, ecDNA-containing tumors up-regulate DNA damage and repair, cell cycle control, and mitotic processes, but down-regulate global immune regulation pathways. Taken together, these results suggest profound alterations in gene regulation in ecDNA-containing tumors, shedding light on molecular processes that give rise to their development and progression.

Keywords: cancer; chromosomes; ecDNA; extrachromosomal DNA; gene expression; human; transcriptomics.

PubMed Disclaimer

Conflict of interest statement

ML, SJ No competing interests declared, JL J.L. receives compensation as a consultant for Boundless Bio, HC Reviewing editor, eLife, SW S. Wu is a member of the scientific advisory board of Dimension Genomics Inc, PM P.S.M. is a co-founder and advisor of Boundless Bio. J.L. receives compensation as a consultant for Boundless Bio, VB V.B. is a co-founder, paid consultant, SAB member and has equity interest in Boundless Bio, Inc and Abterra Biosciences, Inc

Figures

Figure 1.
Figure 1.. Genes predictive of extrachromosomal DNA (ecDNA) status.
(A) The feature selection algorithm, Boruta, was applied to 200 datasets of randomly selected subsets consisting of 80% of all samples. Genes selected by Boruta in at least 10 of the 200 trials were identified as the Core set of genes (408) that were predictive of ecDNA presence. (B) Identification of highly co-expressed and stable gene clusters using pvclust expanded the Core set by an additional 235 genes to the final list of 643 CorEx genes. (C) Out of 354 clusters, the majority (344) of clusters contained one or two Core genes. (D) Most clusters were small, with only seven clusters containing more than 10 genes.
Figure 1—figure supplement 1.
Figure 1—figure supplement 1.. Cartoon illustration of the rationale for Boruta analysis.
A differentially expressed gene (e.g. gene A) may have a higher log-fold change (LFC) value than that of a CorEx gene (e.g. gene B), but may not be persistently over- or under-expressed in ecDNA(+) samples. Therefore, no expression value cut-off can be used to separate ecDNA(+) samples from ecDNA(-) samples using gene A expression. In contrast, gene B expression can separate the two classes perfectly.
Figure 1—figure supplement 2.
Figure 1—figure supplement 2.. Extended co-expressed cluster characterization.
Pvclust identified 354 highly co-expressed clusters where members are selected in at least 10 out of 200 Boruta trials as a cluster. Of the 354 clusters, two did not contain any Core genes, 281 only contained Core genes, and 71 contained at least one Core and one co-expressed gene.
Figure 1—figure supplement 3.
Figure 1—figure supplement 3.. Gene expression of Core versus co-expressed genes in clusters.
Cluster #74 contains a Core gene, RAE1, and a co-expressed gene, CSTF1. Cluster #3 contains nine Core genes and 12 co-expressed genes.
Figure 1—figure supplement 4.
Figure 1—figure supplement 4.. Impact of tumor purity on CorEx gene expression.
Of the 870 TCGA samples (234 ecDNA(+), 636 ecDNA(-)), 701 samples (174 ecDNA(+), 527 ecDNA(-)) were assigned a consensus measurement of purity estimation (CPE) by Aran et al., 2015. (A) ecDNA(-) samples have slightly lower purity than ecDNA(+) samples (MWU p-value 0.0036). (B) For samples with high tumor purity (CPE ≥0.8, n=287), the expression directionality of CorEx genes (i.e. based on the Mann-Whitney U test) was highly correlated with that of all samples (n=870). Among the 275 genes that had a higher expression in the ecDNA(+) samples of all samples, 243 genes also had a higher expression in the ecDNA(+) samples of high tumor purity samples (red dots). Among the 284 genes that had a lower expression in the ecDNA(+) samples of all samples, 181 genes also had a lower expression in the ecDNA(+) samples of high tumor purity samples (blue dots). Together, 424 of 559 (75.8%) CorEx genes with significant directionality were in agreement between all samples and samples with high tumor purity.
Figure 1—figure supplement 5.
Figure 1—figure supplement 5.. Gene Ontology (GO) biological processes enriched by up-regulated CorEx genes from eight selection criteria ranging from 5 to 200 of 200 Boruta trials.
Of the 187 GO terms (turquoise) that were enriched by 262 up-regulated CorEx genes using 10 out of 200 Boruta trials as the selection criteria, 93 terms (49.7%) were enriched for each cut-off criteria, and 155 terms (82.9%) were enriched in at least 5 of the 8 cut-off criteria (dark blue). A minimum degree of three was required when using the python UpSet plot function (upsetplot v0.8.0).
Figure 2.
Figure 2.. Validation of CorEx genes.
(A) Cross-validation experiments validating the predictive value of CorEx genes. Precision denotes the fraction of predicted samples that were truly ecDNA(+). Recall refers to the fraction of ecDNA(+) samples that were predicted correctly. (B) For precision windows of width 0.1 and a value of at least 0.5, recall values were plotted as boxplots. The interquartile ranges for CorEx and Core genes overlap, suggesting similar predictive power. CorEx genes have higher predictive rates compared to the top 643 differentially expressed genes based on logarithmic fold changes from a DESeq2 analysis (Top-|LFC| genes), 3012 significant genes selected from a generalized linear model (GLM), and 643 randomly selected genes. (C) CorEx genes were consistently up- or down-regulated in ecDNA(+) samples across tumor types, with the exception of Sarcoma (SARC). Approximately unbiased (AU) p-values from multiscale bootstrap resampling are shown at the dendrogram branches. (D) Of the 643 Top-|LFC| genes, 240 were up-regulated while 403 were down-regulated in ecDNA(+) samples. Of the CorEx genes, 325 were up-regulated while 318 were down-regulated. The absolute log-fold change (LFC) values of the Top-|LFC| gene set was significantly greater than that of the CorEx genes (p-value 1.83e-158). (E) The normalized gene expression values of the CorEx genes were significantly higher than that of the Top-|LFC| gene set (p-value <2e-308). ***p-value <0.001.
Figure 2—figure supplement 1.
Figure 2—figure supplement 1.. Gene expression of Top-|LFC| versus CorEx genes.
86 CorEx genes are part of the 643 Top-|LFC| gene set.
Figure 2—figure supplement 2.
Figure 2—figure supplement 2.. Extended validation of CorEx genes.
(A) Cross-validation experiments validating the predictive value of CorEx genes. Precision denotes the fraction of predicted samples that were truly ecDNA(+). Recall refers to the fraction of ecDNA(+) samples that were predicted correctly. CorEx genes have similar predictive value to Core genes. (B) CorEx genes have higher predictive rates compared to 643 randomly selected genes and the top 643 differentially expressed genes based on logarithmic fold changes from a DESeq2 analysis (Top-|LFC| genes).
Figure 2—figure supplement 3.
Figure 2—figure supplement 3.. Differential expression of Core genes by tumor type.
Core genes are also consistently up- or down-regulated in ecDNA(+) samples across tumor types, with the exception of Sarcoma (SARC), similar to CorEx genes (Figure 2C). AU p-values from multiscale bootstrap resampling are shown at the dendrogram branches.
Figure 2—figure supplement 4.
Figure 2—figure supplement 4.. Up-regulation of CorEx genes located on amplicons.
(A) PNMT normalized gene expression in ecDNA(+) vs. ecDNA(-) samples. (B) ITLN1 normalized gene expression in ecDNA(+) vs. ecDNA(-) samples. In both cases, their up-regulation in ecDNA(+) samples is explained by their presence on a region with somatic, focal copy number amplification.
Figure 3.
Figure 3.. Up-regulated CorEx genes.
(A) Gene Ontology (GO) biological processes enriched in up-regulated genes were clustered into 11 broad categories. The horizontal barplot represents the number of GO biological processes belonging to each of the 11 broad categories, while the vertical barplot represents the number of broad categories that a specific GO biological process belongs to. (B) Genes up- or down-regulated in processes involved in major double-strand break (DSB) damage repair pathways. Many critical genes in the classical non-homologous end-joining (c-NHEJ) pathway were down-regulated in ecDNA(+) samples relative to ecDNA(-) samples.
Figure 3—figure supplement 1.
Figure 3—figure supplement 1.. Biological process categories enriched in up-regulated CorEx genes.
187 enriched Gene Ontology Biological Process (GOBP) terms represented by 169 up-regulated CorEx genes. The enriched biological processes cluster into 11 categories, which in turn can be grouped into cell-cycle regulation, Mitotic cell division, double-strand break DNA Damage response, and the HOX gene cluster.
Figure 4.
Figure 4.. Down-regulated CorEx genes.
(A) Gene Ontology (GO) biological processes enriched in down-regulated genes were clustered into seven broad categories. The horizontal barplot represents the number of GO biological processes belonging to each of the seven broad categories, while the vertical barplot represents the number of broad categories that a specific GO biological process belongs to. (B) Four of these categories map to steps in the cancer-immunity cycle. CorEx genes in three of the four categories were significantly down-regulated compared to all genes (Fisher’s exact test).
Figure 4—figure supplement 1.
Figure 4—figure supplement 1.. Biological process categories enriched in down-regulated CorEx genes.
73 enriched GOBP terms represented by 119 down-regulated CorEx genes. The enriched biological processes can be clustered into 7 categories, 6 of which are related to immune response, suggesting a broad-based deregulation.
Figure 4—figure supplement 2.
Figure 4—figure supplement 2.. Tumor microenvironment (TME) subtypes in extrachromosomal DNA (ecDNA)-containing tumors.
TME sub-typing of The Cancer Genome Atlas (TCGA) samples classified as ecDNA(+) or ecDNA(-) (Amplicon Classifier ver. 0.4.9) based on the immune subtypes provided by Thorsson et al., 2018.
Figure 5.
Figure 5.. Mutational characteristics of extrachromosomal DNA (ecDNA)-containing tumors.
(A) Total mutation burden of ecDNA(+) and ecDNA(-) samples. ecDNA(+) samples have significantly higher mutation burden than the ecDNA(-) samples (p-value <0.0001, Mann Whitney test). (B) Odds ratios of differentially mutated genes in ecDNA(+) and ecDNA(-) (p-value <0.005). The size of the dot indicates whether the corresponding gene belongs to the Cancer Gene Census (CGC) or not (Non-CGC). Only TP53 and BRAF showed significance at the level of FDR <0.1 (Benjamini-Hochberg).
Figure 5—figure supplement 1.
Figure 5—figure supplement 1.. Mutation burden based on damaging mutations.
Mutation burden of ecDNA(+) and ecDNA(-) calculated only with damaging mutations (deleterious SNVs predicted with SIFT or PolyPhen2, frameshift INDELs).
Figure 5—figure supplement 2.
Figure 5—figure supplement 2.. Mutation burden by tumor type.
Only mutation burdens of glioblastoma (GBM), low-grade gliomas (LGG), and uterine corpus endometrial carcinoma (UCEC) were significantly different between ecDNA(+) and ecDNA(-).
Figure 5—figure supplement 3.
Figure 5—figure supplement 3.. Performance of XGBoost model.
Performance evaluation of the best XGBoost model selected by HyperOpt.
Figure 5—figure supplement 4.
Figure 5—figure supplement 4.. Unsupervised principal component analysis on gene mutations.
PCA result performed with a binary mutational matrix of all mutated genes.
Figure 5—figure supplement 5.
Figure 5—figure supplement 5.. Single base substitution (SBS) signatures in The Cancer Genome Atlas (TCGA) samples by extrachromosomal DNA (ecDNA) status.
The activities of SBS signatures 1–60 in ecDNA(+) (orange) and ecDNA(-) (blue) samples (A) measured across 1440 TCGA samples and (B) separated by tumor types with at least one ecDNA(+) sample each. Activity refers to the estimated relative contribution of a specific mutational signature, and is calculated as follows: (number of variants that are attributed to a specific signature in a group)/(total number of variants in a group).
Author response image 1.
Author response image 1.
Author response image 2.
Author response image 2.
Author response image 3.
Author response image 3.
Author response image 4.
Author response image 4.
Author response image 5.
Author response image 5.
Author response image 6.
Author response image 6.
Author response image 7.
Author response image 7.
Author response image 8.
Author response image 8.
Author response image 9.
Author response image 9.
Author response image 10.
Author response image 10.
Author response image 11.
Author response image 11.

Update of

Similar articles

Cited by

References

    1. Adzhubei I, Jordan DM, Sunyaev SR. Predicting functional effect of human missense mutations using PolyPhen-2. Current Protocols in Human Genetics. 2013;Chapter 7:Unit7. doi: 10.1002/0471142905.hg0720s76. - DOI - PMC - PubMed
    1. Aran D, Sirota M, Butte AJ. Systematic pan-cancer analysis of tumour purity. Nature Communications. 2015;6:8971. doi: 10.1038/ncomms9971. - DOI - PMC - PubMed
    1. Bagaev A, Kotlov N, Nomie K, Svekolkin V, Gafurov A, Isaeva O, Osokin N, Kozlov I, Frenkel F, Gancharova O, Almog N, Tsiper M, Ataullakhanov R, Fowler N. Conserved pan-cancer microenvironment subtypes predict response to immunotherapy. Cancer Cell. 2021;39:845–865. doi: 10.1016/j.ccell.2021.04.014. - DOI - PubMed
    1. Bergstrom EN, Luebeck J, Petljak M, Khandekar A, Barnes M, Zhang T, Steele CD, Pillay N, Landi MT, Bafna V, Mischel PS, Harris RS, Alexandrov LB. Mapping clustered mutations in cancer reveals APOBEC3 mutagenesis of ecDNA. Nature. 2022;602:510–517. doi: 10.1038/s41586-022-04398-6. - DOI - PMC - PubMed
    1. Chang HHY, Pannunzio NR, Adachi N, Lieber MR. Non-homologous DNA end joining and alternative pathways to double-strand break repair. Nature Reviews. Molecular Cell Biology. 2017;18:495–506. doi: 10.1038/nrm.2017.48. - DOI - PMC - PubMed