Nat Cell Biol. 2025 Nov;27(11):1914-1924.
doi: 10.1038/s41556-025-01788-6. Epub 2025 Oct 21.

Enhancer activation from transposable elements in extrachromosomal DNA

Katerina Kraft et al. Nat Cell Biol. 2025 Nov.

Abstract

Extrachromosomal DNA (ecDNA) drives oncogene amplification and intratumoural heterogeneity in aggressive cancers. While transposable element reactivation is common in cancer, its role on ecDNA remains unexplored. Here we map the 3D architecture of MYC-amplified ecDNA in colorectal cancer cells and identify 68 ecDNA-interacting elements (EIEs): genomic loci enriched for transposable elements that are frequently integrated onto ecDNA. We focus on an L1M4a1#LINE/L1 fragment co-amplified with MYC, which functions only in the ecDNA-amplified context. Using CRISPR-CATCH, CRISPR interference and reporter assays, we confirm its presence on ecDNA, enhancer activity and essentiality for cancer cell fitness. These findings reveal that repetitive elements can be reactivated and co-opted as functional rather than inactive sequences on ecDNA, potentially driving oncogene expression and tumour evolution. Our study uncovers a mechanism by which ecDNA harnesses repetitive elements to shape cancer phenotypes, with implications for diagnosis and therapy.

Conflict of interest statement

Competing interests: H.Y.C. is a cofounder of Accent Therapeutics, Boundless Bio, Cartography Biosciences and Orbital Therapeutics; he was an advisor of 10x Genomics, Arsenal Biosciences, Chroma Medicine and Spring Discovery until 15 December 2024. H.Y.C. is an employee and stockholder of Amgen as of 16 December 2024. M.G.J. is a consultant and holds equity in Tahoe Therapeutics. P.S.M. is a co-founder and advisor of Boundless Bio. J.D.B. is a founder and director of CDI Labs, Inc.; a founder of and consultant to Opentrons LabWorks/Neochromosome, Inc.; and serves or served on the scientific advisory boards of the following: CZ Biohub New York, LLC; Logomix, Inc.; Modern Meadow, Inc.; Rome Therapeutics, Inc.; Sangamo, Inc.; Tessera Therapeutics, Inc.; and the Wyss Institute. V.B. is a cofounder, serves on the scientific advisory board of Boundless Bio and Abterra and holds equity in both companies. Q.S. is an employee and stockholder of Amgen as of 20 February 2025. K.K. is an employee and stockholder of Amgen as of 15 September 2025. The other authors declare no competing interests.

Figures

Fig. 1. Identification of EIEs.
a, Schematic of the Hi-C method performed in the ecDNA-containing COLO320DM cell line. b, Identification of EIEs. Sixty-eight individual EIEs were manually annotated across all chromosomes based on the interaction across the entirety of the MYC-amplified region of chromosome 8. The visualization represents the ecDNA from chromosome 8, with three examples of EIEs localized on other chromosomes. c, An example of a specific interaction, EIE 14 on chromosome 3, is enlarged, and associated genes are shown for both loci. The arrow and purple hexagon indicate EIE. d, Overlap fraction between EIE sequences and annotated LINE, SINE and LTR elements as reported in RepBase. EIEs are clustered according to similarity in overlap fraction across these three classes of repetitive elements. e, Pipeline for using Oxford Nanopore ultralong-read sequencing to identify the overlap of ecDNA genomic intervals and EIE-containing reads. f, The number of reads that contain a particular EIE and overlap with an ecDNA interval in the COLO320DM cell line. Counts are reported as log10(1 + x). Average genome coverage (approximately 12.1) is represented as a dashed red line. g, Reconstruction of the ecDNA breakpoint graph for COLO320DM from Oxford Nanopore ultralong-read data using the CoRAL algorithm. The EIE 14 region is highlighted in red, and the breakpoint indicating its translocation to the amplified chromosome 8 locus is annotated.
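The overlap fraction in panel d can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the half-open (start, end) coordinate convention and the example intervals are all assumptions made for illustration. It computes the fraction of one element's length that is covered by any annotated repeat, merging overlapping repeats so that no base is counted twice.

```python
def overlap_fraction(element, repeats):
    """Fraction of `element` covered by the union of `repeats`.

    Hypothetical helper: `element` and each repeat are half-open
    (start, end) intervals on the same chromosome (an assumption).
    """
    start, end = element
    if end <= start:
        raise ValueError("element must have positive length")
    # Keep only repeats that intersect the element, clipped to its bounds.
    clipped = sorted(
        (max(s, start), min(e, end)) for s, e in repeats if s < end and e > start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # A gap: close the previous merged block and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent repeat: extend the current block.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (end - start)


# Illustrative intervals only (not data from the study).
fraction = overlap_fraction((100, 200), [(90, 120), (110, 150), (180, 250)])
```

In practice one such fraction would be computed per element and per repeat class (LINE, SINE, LTR), giving the three values by which the elements are clustered.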
Fig. 2. CRISPR-CATCH elucidates ecDNA composition and EIE insertions.
a, A schematic diagram illustrating the CRISPR-CATCH experiment designed to isolate and characterize ecDNA components. The process uses guide RNAs targeting EIE 14 from chromosome 3. DNA is embedded in agarose, followed by PFGE, allowing band extraction and subsequent next-generation sequencing (NGS) of ecDNA fragments. The negative control (NC) is a guide RNA not targeting ecDNA. b, The PFGE gel image displays the separation of DNA fragments, including lanes for the left ladder, ladder, empty lane, negative control, sgRNA #1, sgRNA #2, and band numbers corresponding to those analysed by NGS in c and d. Targeting EIE 14 with guide RNAs leads to cleavage of the ecDNA chromosome 8 sequences, resulting in multiple discrete bands and confirming the insertion of EIE 14 onto ecDNA. sgRNA #1 ATATAGGACAGTATCAAGTA; sgRNA #2 TATATTATTAGTCTGCTGAA; full EIE 14 sequences from long-read sequencing are presented in Supplementary Table 6. c, Whole-genome sequencing results confirm the presence of EIE 14, originally annotated on chromosome 3, within the ecDNA, between the CASC8 and CASC11 genes, approximately 200 kilobases upstream from MYC. The dotted line indicates the position of this insertion. Each band is an ecDNA molecule of a different size that contains the EIE 14 insertion. d, Additional EIEs identified in the initial Hi-C screen, captured and sequenced in the CRISPR-CATCH gel bands from b. Each EIE is represented by a vertical shaded box with genomic coordinates, indicating insertion events within the ecDNA. e, ORCA visualization of the COLO320DM cell nucleus. The maximum-projected images show the spatial arrangement of the MYC oncogene, EIE 14 and the PVT1 locus, labelled in different colours for two different cells. The leftmost panel is an overlay of all images registered to nanometre precision (Methods). Scale bars, 5 μm. The Chr3 probe maps to the breakpoints of the EIE 14 origin inside the CD96 intron.
Fig. 3. EIE 14 spatially clusters with MYC.
a, x,y,z projections of MYC exon (purple), PVT1 (blue) and EIE 14 (pink). b, Endogenous coordinates of all three measured genomic regions. c, Single-cell projection of the 3D fitted points from a. Scale bar, 2 μm. d, Pairwise distances between MYC (purple), PVT1 (blue) and EIE 14 (pink) of a single cell. Number of fitted points per genomic region n = 60, n = 43 and n = 25, respectively. e, Histogram showing the distribution of observed shortest pairwise distances between EIE 14 signals, compared with the expected shortest pairwise distances from randomly simulated points within a sphere, for n = 1,329 analysed cells across two biological replicates (two-tailed Wilcoxon rank-sum P < 1 × 10^−10). f, As in e but for MYC-to-MYC shortest pairwise distances (two-tailed Wilcoxon rank-sum P < 1 × 10^−10). g, Schematic of Ripley’s K function to describe clustering behaviours over different nucleus volumes. Top: the nucleus divided into different shell intervals and how the K value is plotted for increasing radius (r). Bottom: an example of what clustered (K(r) > 1) versus random (K(r) ≈ 1) points could look like. K values greater than 1 indicate clustering behaviour relative to a random distribution over that given distance interval (r), K values of ~1 denote random distribution, while K values less than 1 indicate dispersion behaviour. h, The average K(r) value across distance intervals of 0.01–0.5 μm in 0.02-μm step sizes to describe the clustering relationship of PVT1 and EIE 14 relative to MYC across different distance intervals (μm). Error bars denote s.e.m. (two-tailed Wilcoxon rank-sum P = 0.01442).
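The comparison in panels e and f, observed shortest pairwise distances against those expected for random points in a sphere, can be sketched as follows. This is an illustrative outline only, not the authors' analysis code: the function names, the rejection-sampling approach, the sphere radius and the fixed seed are all assumptions.

```python
import math
import random


def random_point_in_sphere(radius, rng):
    """Draw one point uniformly from a sphere centred at the origin."""
    # Rejection sampling: draw from the bounding cube, keep points inside the sphere.
    while True:
        x, y, z = (rng.uniform(-radius, radius) for _ in range(3))
        if x * x + y * y + z * z <= radius * radius:
            return (x, y, z)


def shortest_pairwise_distances(points):
    """For each point, the distance to its nearest other point (needs >= 2 points)."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def simulate_null(n_points, radius, seed=0):
    """Shortest pairwise distances for n_points placed at random in a sphere."""
    rng = random.Random(seed)
    pts = [random_point_in_sphere(radius, rng) for _ in range(n_points)]
    return shortest_pairwise_distances(pts)


# Hypothetical example: 25 signals in a nucleus modelled as a sphere of radius 5 (in μm).
null_distances = simulate_null(n_points=25, radius=5.0)
```

Under these assumptions, the simulated distances would be pooled across cells and compared with the observed distribution using a rank-sum test; observed distances that are systematically shorter than the simulated ones would suggest clustering.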
Fig. 4. EIE 14 is important for cell proliferation and has enhancer signatures.
a, Schematic of the CRISPRi screening strategy used to evaluate the regulatory potential of the 68 EIEs by designing 4–6 gRNAs per element for a total of 257 genomic regions tested and 125 non-targeting control sgRNAs. The screen involved the transduction of cells with a lentivirus expressing dCas9-KRAB and the sgRNAs such that each cell received one sgRNA, followed by calculation of cell growth phenotype over a series of timepoints (baseline (4 days), baseline + 3 days, baseline + 14 days and baseline + 1 month). The screen was further filtered on guide specificity (Methods), and 36/68 targeted EIEs met the qualifying threshold. b, The growth phenotype of COLO320DM cells 2 weeks post-transduction, relative to the non-targeting control (NTC). Each point represents the average guide effect (Z score) for sgRNAs targeting the 36 qualifying EIEs, ranked by their impact on cell growth. EIE 14 is indicated by a dashed rectangle, with a negative Z score <−1 (significant negative impact on cell viability). See Extended Data Fig. 4 for additional timepoints. Positive hits are labelled in pink with their corresponding EIE. c, UCSC Genome Browser multiregion view showing the locations of the EIEs within the genome. Each EIE is indicated by a vertical bar. The browser displays the annotations for genes and repetitive elements such as Alu, LINE and LTR elements (RepeatMasker); the ATAC-seq dataset is normalized for library size (Methods). d, Zoom-in of EIE 14’s histone marks: enrichment of H3K27 acetylation, BRD4 binding and ATAC-seq peaks. ChIP-seq data were normalized to input to control for copy number. ATAC-seq data were normalized to library size (Methods). e, H3K9me3 histone modification of EIE 14 across ENCODE cell lines.
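The "average guide effect (Z score)" in panel b can be read as follows; the caption does not give the exact formula, so this is a sketch under stated assumptions. Here each guide is assumed to have a log fold change in abundance relative to baseline, each guide is standardized against the mean and standard deviation of the non-targeting controls, and the resulting Z scores are averaged per element. The function name, guide names and values are hypothetical.

```python
import statistics


def element_z_scores(guide_lfc, guide_to_element, ntc_guides):
    """Average per-guide Z score for each targeted element.

    guide_lfc: dict mapping guide name -> log fold change (assumed input).
    guide_to_element: dict mapping targeting guide -> element it targets.
    ntc_guides: names of the non-targeting control guides.
    """
    ntc_values = [guide_lfc[g] for g in ntc_guides]
    mu = statistics.mean(ntc_values)
    sd = statistics.stdev(ntc_values)  # sample standard deviation of the controls
    per_element = {}
    for guide, element in guide_to_element.items():
        z = (guide_lfc[guide] - mu) / sd
        per_element.setdefault(element, []).append(z)
    return {el: statistics.mean(zs) for el, zs in per_element.items()}


# Made-up values for illustration only.
lfc = {"ntc1": -1.0, "ntc2": 0.0, "ntc3": 1.0, "g1": -2.0, "g2": -1.0, "g3": 0.5}
targets = {"g1": "EIE 14", "g2": "EIE 14", "g3": "EIE 7"}
scores = element_z_scores(lfc, targets, ["ntc1", "ntc2", "ntc3"])
```

With this convention, an element whose guides deplete faster than the controls gets a negative average Z score, which matches the direction reported for EIE 14.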
Fig. 5. ecDNA context is critical for EIE 14 enhancer activity.
a, Schematic outlining the COLO320DM cell line as having high copy number and high ecDNA levels, versus the HSR cell line, which has high copy number but low ecDNA. b, RNA-FISH labelling for EIE 14 and MYC exon 2 transcription in COLO320DM and HSR cells. Median transcripts for EIE 14 are 4 and 0 for the DM and HSR cells, respectively (two-tailed Wilcoxon rank-sum P = 8.22 × 10^−94). DM cells have a median of 14 MYC transcripts, and HSR cells have a median of 8 transcripts per cell (two-tailed Wilcoxon rank-sum P = 2.18 × 10^−66). n = 712 cells (DM) and n = 681 cells (HSR) across two biological replicates. Scale bars, 8 μm. c, Luciferase enhancer assay schematics and fold change in luciferase signal driven by either MYC or TK promoter normalized to promoter-only construct. n = 4 biological replicates. EIE 14 compared with positive control (PVT1 positive control from ref. ). P values obtained from two-tailed unpaired t-test. Error bars are standard deviations from the mean. d, Schematic outlining EIE 14 as a translocation event in healthy patients where EIE 14 is normally inactive across annotated cell lines (see a). EIE 14 gains regulatory potential when it is amplified within ecDNA as a consequence of translocation near MYC. EIE 14 can then act as a regulator of MYC in both cis and trans contacts within and between ecDNAs.
Extended Data Fig. 1. Repetitive-element context, structural variations, and sequence composition of EIE 14.
a. Overlap of each EIE with the annotated genomic coordinates of LINE, SINE, or LTR elements. The background genome average of each class of repetitive element is reported as a solid black line. b. Number of structural variations called in stripe alignments (top) and relationship between structural variations and read count for each element (bottom). Pearson correlation is 0.61. c. Schematics of ecDNA harboring the 1.7 kb sequence obtained from long-read analysis of EIE 14. The region spanning 6-710 bp aligns with the 3’ end of the LINE-1 element (L1PA2), whereas the region from 711-1690 bp is unique to intron 2 of the CD96 locus on chromosome 3 (L1M4a1). The L1M4a1-like segment harbors a polyA-signal–like motif (AAAAAG). d. Top panel, alignment of the predicted protein from 6-710 bp with LINE-1 ORF2 (L1PA2). Bottom panel, amino acid alignment of LINE-1 ORF2 (L1PA2) and the protein encoded by 6-710 bp, by ClustalW. Source numerical data are available in source data.
Extended Data Fig. 2. Long‑read DNA sequencing evidence for EIE 14 insertion and genomic context.
a. Screenshot of the IGV viewer with selected long reads depicting insertion sizes in purple. b. Long-read pileup across chr3 and chr8 demonstrating EIE 14 translocation to chr8 ecDNA locus. c. Sequence alignment of the T2T genome.
Extended Data Fig. 3. Copy‑number and spatial relationships between MYC, PVT1 and EIE 14.
a. EIE 14 DNA-FISH labeling of K562 cells lacking ecDNA and amplification of chr8. Arrow points to the FISH signal. The experiment was performed twice with similar results. b. Quantification of copy number of MYC, PVT1 and EIE 14 across all measured cells (n = 1329, 2 biological replicates). Mean copy number of MYC is 29 copies per cell, PVT1 is 31 copies per cell and EIE 14 is 22 copies per cell. Copy numbers for all species ranged from 0 to 150. c. Correlation plots between the loci per cell. Pearson’s correlation coefficient calculated for PVT1-MYC r = 0.82, EIE 14-MYC r = 0.71, EIE 14-PVT1 r = 0.74. d. Violin plots of shortest distances of MYC to PVT1 and EIE 14 (median distance denoted by red line). e. Histogram of shortest distances of MYC to PVT1 (blue) and MYC to EIE 14 (orange) (two-sided Wilcoxon rank-sum p = 1.23 × 10^−5). Source numerical data and images are available in source data.
Extended Data Fig. 4. CRISPRi screen of EIEs in COLO320DM for different timepoints.
a. Schematic of the CRISPRi screening strategy used to evaluate the regulatory potential of the 68 EIEs by designing 4-6 gRNAs per element for a total of 257 genomic regions tested and 125 non-targeting control sgRNAs. The screen involved the transduction of cells with a lentivirus expressing dCas9-KRAB and the sgRNAs such that each cell received 1 sgRNA, followed by calculation of cell growth phenotype over a series of time points (Baseline(4 days), Baseline + 3 days, Baseline + 14 days, and Baseline + 1 month). The screen was further filtered on guide specificity (methods) and 36/68 targeted EIEs met the qualifying threshold. b. The growth phenotype of COLO320DM cells and reproducibility of counts between two biological replicates at different timepoints for the 36 qualifying EIEs. Each point represents the average guide effect (Z-score). Pearson correlations (r) are reported for reproducibility plots. c. Growth phenotypes of the qualifying EIE-targeting guides in COLO320DM cells across multiple time points. Each point represents the average guide effect (Z-score) for sgRNAs targeting a specific EIE. Guides with an average abs(Z-score) > 1 are annotated.
Extended Data Fig. 5. ATAC‑seq signal of EIEs across cell models.
ATAC-seq data for COLO320DM and SNU16 were obtained from Hung, Yost et al. and for PC3 and GBM39KT from Wu et al. The ATAC-seq data were previously mapped to hg19 and were further normalized by library size.
Extended Data Fig. 6. RNA‑FISH quantification and enhancer reporter activity for EIE 14.
a. Quantification of RNA-FISH signal on a per cell basis from COLO320-DM cells labeling MYC exon 2 and EIE 14 (see Methods for quantification method). n = 712 cells across 2 biological replicates. b. As in (a) but for COLO320-HSR cells. n = 681 cells across 2 biological replicates. c. Fold change in luciferase signal driven by the L1PA2 (part 1), L1M4a1 (part 2) and combined (EIE 14) regions of EIE 14 for n = 3 biological replicates. P-value MYC promoter (ctrl) vs part 1 p = 0.0006; p-value MYC promoter (ctrl) vs part 2 p = 0.0015; p-value MYC promoter (ctrl) vs part 1 + part 2 (EIE 14) p = 0.0001 (p-values obtained from two-tailed unpaired t-test). Error bars are standard deviations from the mean. Source numerical data and images are available in source data.
