Cancer Discov. 2023 Apr 3;13(4):910-927.
doi: 10.1158/2159-8290.CD-22-0900.

Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration

Keiko Akagi et al. Cancer Discov.

Abstract

The human papillomavirus (HPV) genome is integrated into host DNA in most HPV-positive cancers, but the consequences for chromosomal integrity are unknown. Continuous long-read sequencing of oropharyngeal cancers and cancer cell lines identified a previously undescribed form of structural variation, "heterocateny," characterized by diverse, interrelated, and repetitive patterns of concatemerized virus and host DNA segments within a cancer. Unique breakpoints shared across structural variants facilitated stepwise reconstruction of their evolution from a common molecular ancestor. This analysis revealed that virus and virus-host concatemers are unstable and, upon insertion into and excision from chromosomes, facilitate capture, amplification, and recombination of host DNA and chromosomal rearrangements. Evidence of heterocateny was detected in extrachromosomal and intrachromosomal DNA. These findings indicate that heterocateny is driven by the dynamic, aberrant replication and recombination of an oncogenic DNA virus, thereby extending known consequences of HPV integration to include promotion of intratumoral heterogeneity and clonal evolution.

Significance: Long-read sequencing of HPV-positive cancers revealed "heterocateny," a previously unreported form of genomic structural variation characterized by heterogeneous, interrelated, and repetitive genomic rearrangements within a tumor. Heterocateny is driven by unstable concatemerized HPV genomes, which facilitate capture, rearrangement, and amplification of host DNA, and promotes intratumoral heterogeneity and clonal evolution. See related commentary by McBride and White, p. 814. This article is highlighted in the In This Issue feature, p. 799.

Figures

Figure 1. LR-seq reads containing only HPV sequences revealed frequent HPV concatemers with and without SVs in multiple cancers and cell lines. A–D, Shown are (top, y-axis) read count histograms and (bottom, y-axis) plots of the distance (Δ) between 5′ and 3′ mapped coordinates when HPV-only ONT reads were aligned against the HPV16 reference genome for (A) tumor 1, (B) tumor 2, (C) tumor 3, and (D) VU147 cell line. X-axis, top and bottom panels, ONT read lengths in kilobase pairs (kb); n, number of aligned ONT reads. Bottom, heat map, read counts. E, Schematic depicting distance Δ between read 5′ and 3′ ends (based on half-maximal genome unit circumference, 7,906 bp ÷ 2). Gray, top and bottom, two ONT reads aligned against (red) a one-unit circle of the HPV16 genome. F, Representative ONT reads from samples in A–D aligned against concatemeric HPV genomes. X-axis, dashed lines, ∼7.9-kb HPV genome unit length; black arrows, orientation of HPV genome from coordinates 1 to 7,906. G, Dot plots depict (light gray) alignments of (x-axis) representative ONT reads from VU147 cells of variable lengths against (y-axis, arrow) one ∼7.9-kb HPV genome unit. DUP, duplications; DEL, deletions; INV, inversions; colored circles, sites of discordant or split reads supporting a breakpoint. H, Virus-only VU147 ONT reads are shown as (top) block diagrams and (bottom) breakpoint plots, grouped by the presence of unique virus–virus breakpoints. Red lines, HPV genome (vertical black ticks, HPV reference coordinate 0; vertical white ticks, HPV rearrangement); colored dots, numbers, inset key, breakpoints; numbers below block diagrams, group-defining breakpoints. See also Supplementary Figs. S2.2 and S2.3.
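The distance Δ in panel E is bounded by half the circumference of one circular genome unit (7,906 bp ÷ 2 = 3,953 bp). A minimal Python sketch of that bound follows, assuming Δ is the shorter of the two arcs between a read's 5′ and 3′ mapped coordinates. The function name and this reading of Δ are illustrative assumptions, not the authors' code.

```python
HPV16_GENOME_LENGTH = 7906  # bp in one HPV16 genome unit, as given in the legend


def circular_distance(start: int, end: int,
                      genome_length: int = HPV16_GENOME_LENGTH) -> int:
    """Return the shorter arc (in bp) between two coordinates on a circular genome.

    Assumption: this is one plausible reading of the legend's distance;
    the result can never exceed genome_length // 2.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    # Linear separation, wrapped so that whole extra turns are ignored.
    linear = abs(start - end) % genome_length
    # Going the other way around the circle may be shorter.
    return min(linear, genome_length - linear)
```

Under this reading, a read whose two ends map to coordinates 1 and 7,906 has a distance of 1 bp rather than 7,905 bp, because the ends are adjacent on the circle.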
Figure 2. HPV integration induced intratumoral heterogeneity and clonal evolution. Analysis of LR-seq reads from tumor 4 revealed shared breakpoint patterns and extensive heterogeneity in virus–virus and virus–host DNA structures. A, Depths of sequencing coverage, estimated copy number, and breakpoints at HPV integration sites at (left to right) Chrs. 5p, 5q, and Xp and in the HPV16 genome, as indicated. Top, IGV browser display of (y-axis, blue) WGS coverage; middle (red) virus–host and (gray) host-host or virus–virus breakpoints at chromosomal coordinates. Bracketed numbers, range of aligned sequence read counts; numbers above WGS coverage, estimated copy number; circles, numbers, identifiers of each (top) segment-defining and (bottom) segment nondefining breakpoint (see Supplementary Table S2.1). Bottom, left, genomic segments defined by breakpoints (see Supplementary Table S2.2); right, HPV genes. B, ONT reads ≥20 kb are shown as (top) block diagrams and (bottom) breakpoint plots. Groups A1–A10 are defined by shared breakpoint patterns based on breakpoint IDs specified below block diagrams. Red blocks, HPV genome; vertical black lines, HPV reference coordinate 0; white vertical lines, HPV rearrangement; colored blocks, host genome segment as indicated in A. Breakpoint plots within groups also display further heterogeneity characteristic of heterocateny. Red lines, HPV genome; vertical red ticks, HPV reference coordinate 0; gray lines, host DNA segments; colored dots, numbers, inset key, breakpoints. Numbers in parentheses, counts of reads in group, from which representative reads were selected for presentation.
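The legend says read groups are defined by shared breakpoint patterns. A simplified Python sketch of that idea follows, assuming each read has already been reduced to a collection of breakpoint IDs. The read names, IDs, and exact-match rule are hypothetical illustrations, not the published procedure.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List


def group_reads_by_breakpoints(
    reads: Dict[str, Iterable[int]],
) -> Dict[FrozenSet[int], List[str]]:
    """Group read IDs by the exact set of breakpoint IDs each read contains.

    Assumption: two reads belong to the same group only when their
    breakpoint sets are identical; order and repeat counts are ignored.
    """
    groups: Dict[FrozenSet[int], List[str]] = defaultdict(list)
    for read_id, breakpoints in reads.items():
        groups[frozenset(breakpoints)].append(read_id)
    # Sort read IDs so the output is deterministic.
    return {pattern: sorted(ids) for pattern, ids in groups.items()}


# Hypothetical input: read name -> breakpoint IDs observed along the read.
example_reads = {
    "read_a": [1, 2, 2],
    "read_b": [2, 1],
    "read_c": [3],
}
example_groups = group_reads_by_breakpoints(example_reads)
```

Because repeat counts are discarded, reads with the same breakpoints but different numbers of concatemer units fall into one group. That is roughly the within-group heterogeneity the breakpoint plots are described as showing.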
Figure 3. A model of heterocateny depicts how groups of SVs could evolve from a common molecular ancestor. Block diagrams (e.g., A1, A2, and A3), representative ONT reads as in Fig. 2B; brackets, hypothetical intermediate structures; gray, deletions; green, insertions; tan, ecDNA excisions; dashed lines, circularized segments; circular arrow, amplification; block colors, segments defined in Fig. 2A and B.
Figure 4. Heterocateny disrupted the EP300 locus and Chr. 4p15 in tumor 2. A, Depths of sequencing coverage, estimated copy number, and HPV insertional breakpoints at (left to right) the EP300 gene locus at Chr. 22q13.2 and in the HPV16 genome as indicated (see legend of Fig. 2A and Supplementary Tables S3.1 and S3.2 for more details). B, ONT reads of length ≥20 kb shown as (top) block diagrams or (bottom) breakpoint plots. Groups B1–B10 are defined by the breakpoint patterns per breakpoint IDs specified below block diagrams. Red blocks, HPV genome; vertical black lines, HPV reference coordinate 0; white vertical lines, HPV rearrangement; arrowhead, inverse orientation; colored blocks, host genome segment as indicated in A. Breakpoint plots within groups display further heterogeneity characteristic of heterocateny. Red lines, HPV genome; vertical red ticks, HPV reference coordinate 0; gray lines, host DNA segments; colored dots, numbers, inset key, breakpoints. Numbers in parentheses, counts of reads in group, from which representative reads were selected for presentation. C, Depths of sequencing coverage, estimated copy number, and virus–host breakpoints at Chr. 4p15 in tumor 2 as per A. D, Block diagram (top) for a virus–host concatemer in icDNA in Chr. 4 supported by (bottom) representative LR-seq reads ≥20 kb depicted as breakpoint plots. Breakpoint 17 is shared by concatemers at both chromosomal loci.
Figure 5. Intratumoral heterogeneity and clonal evolution are observed in LR-seq reads at MYC in GUMC-395 cells. A, Depths of sequencing coverage, estimated copy number, and breakpoints at HPV integration sites at (left to right) Chr. 8q24.21 (MYC and PVT1 genes) and in HPV16, as indicated (see the legend of Fig. 2A and Supplementary Tables S4.1 and S4.2 for more details). B, ONT reads of length ≥20 kb are shown as (top) block diagrams or (bottom) breakpoint plots. Structural variant groups D1–D9 are defined by the breakpoint patterns per breakpoint IDs specified below block diagrams. Red blocks, HPV genome; vertical black lines, HPV reference coordinate 0; colored blocks, host genome segment as indicated in A. Breakpoint plots within groups display further heterogeneity characteristic of heterocateny. Red lines, HPV genome; vertical red ticks, HPV reference coordinate 0; gray lines, host DNA segments; colored dots, numbers, inset key, breakpoints. Numbers in parentheses, counts of reads in group, from which representative reads were selected for presentation. C, Schematic depicts the potential evolution of structural variant groups in B from a common molecular ancestor. Black X, site of potential homologous recombination; brackets, hypothetical intermediate structures; gray, deletions; green, insertions; tan, ecDNA excisions; dashed lines, circularized segments; circular arrow, amplification; block colors, segments defined in A. D, Schematic supported by LR-seq reads depicts a stepwise model by which insertion of a virus–host concatemer containing MYC is followed by Chr. 8 duplication, inversion of Chr. 8q, chromosomal translocation between centromeres of Chr. 8 and Chr. 21 resulting in t(8;21)(q24;q11), and duplication of this translocation. White arrowhead, inverse orientation.
Figure 6. HPV integration in HeLa cells and HTECs induced CNV, SV, and intrachromosomal rearrangements. Virus–host concatemers in icDNA lead to chromosomal instability in HeLa (A–D) and HTEC (E–G) cells. A, Depths of sequencing coverage, estimated copy number, and breakpoints at HPV integration sites in HeLa at (left to right) Chr. 8q24.21 (upstream of MYC) and in the HPV18 genome, as indicated (see the legend of Fig. 2A and Supplementary Tables S5.1 and S5.2 for more details). B and C, Top, block diagrams depicting concatemerized HPV integrants and rearrangements (B) integrated into flanking intrachromosomal segments at Chr. 8q24 and (C) joining Chr. 22 and Chr. 8 at a translocation. Red blocks, HPV genome; vertical black lines, HPV reference coordinate 0; arrowhead, inverse orientation; colored blocks, host genome segment as indicated in A. Bottom, breakpoint plots of representative ONT reads ≥20 kb supporting each block diagram. Many of the ONT reads demonstrate intrachromosomal integration as they directly connect concatemers with flanking host DNA segments A (left) and F (right). Red lines, HPV genome; vertical red ticks, HPV reference coordinate 0; gray lines, host DNA segments; colored dots, numbers, inset key, breakpoints. D, Stepwise model depicting molecular evolution of Chr. 8, starting with insertion of a virus–host concatemer (inset) into Chr. 8q24.21, likely by homologous recombination, followed by chromosomal translocation to the telomere of Chr. 22 and then to the centromere of Chr. 5. E, Depths of sequencing coverage, estimated copy number, and breakpoints at HPV integration sites in HTEC at (left to right) Chr. 8q24.13 (upstream of MYC) and in the HPV16 genome, as indicated (see legend of Fig. 2A and Supplementary Tables S5.5 and S5.6 for more details). F, ONT reads (bottom, breakpoint plots) supporting integration of a virus–host concatemer in icDNA at Chr. 8q24.13 (top, block diagram). G, Left to right, stepwise model depicting molecular evolution of Chr. 8 in HTEC in vitro, starting with insertion of a virus–host concatemer (inset) into Chr. 8q24.13, likely by homologous recombination, followed by chromosomal duplication and development of isochromosome 8.
Figure 7. A model of HPV heterocateny development, depicting highly diverse but related genomic rearrangements including CNVs and SVs at HPV integration sites, is derived from multiple lines of evidence. (1) Rolling-circle replication of HPV episomes results in (2) unstable virus genome ecDNA concatemers that (3) acquire structural rearrangements and (4) integrate into chromosomes at sites of double-strand DNA breaks. (5) Dynamic excision of virus with captured host DNA leads to (6) serial rounds of amplification of ecDNA by rolling-circle or RDR and recombination events between host and/or HPV segments in the same cells, driving (7) HPV heterocateny and thus intratumoral heterogeneity and clonal evolution. (8) Insertion of ecDNA by recombination into chromosomes (likely through homology-directed repair) can induce (9) chromosomal inversions (INV) and translocations (TRA). (10) Occasional additional rounds of excision may produce more diverse HPV ecDNAs. DUP, duplications.
