Nat Commun. 2022 May 10;13(1):2563.
doi: 10.1038/s41467-022-30190-1.

Long-read sequencing unveils high-resolution HPV integration and its oncogenic progression in cervical cancer

Liyuan Zhou et al. Nat Commun.

Abstract

Integration of human papillomavirus (HPV) DNA into the human genome is considered a key event in cervical carcinogenesis. Here, we perform comprehensive characterization of large-range virus-human integration events in 16 HPV16-positive cervical tumors using Nanopore long-read sequencing technology. Four distinct integration types, characterized by the integrated HPV DNA segments, are identified, with Type B being particularly notable for lacking the E6/E7 genes. We further demonstrate that multiple clonal integration events are involved in the use of shared breakpoints, the induction of inter-chromosomal translocations and the formation of extrachromosomal circular virus-human hybrid structures. Combined with the corresponding RNA-seq data, we highlight LINC00290, LINC02500 and LENG9 as potential driver genes in cervical cancer. Finally, we reveal the spatial relationship between HPV integration and its various structural variations, as well as their functional consequences in cervical cancer. These findings provide insight into HPV integration and its oncogenic progression in cervical cancer.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Integration breakpoints detected in the human and HPV genomes.
a Distribution of integration breakpoints across the human and virus genomes. Each colored link line indicates the human chromosome and the virus gene in which an integration breakpoint occurred. Only HPV16-related breakpoints are shown. b Comparison of the observed and expected numbers of breakpoints in the viral genome. The expected number of breakpoints was calculated on the assumption that breakpoints are uniformly distributed across the viral genome. Only HPV16-related breakpoints are shown (n = 61). P values were calculated by a two-sided binomial test. c Integration breakpoints are often clustered within the same sample in the human genome. Dots represent integration breakpoints and color keys indicate sample sources. Source data are provided as a Source Data file.
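The observed-versus-expected comparison in panel b can be sketched as follows. This is a minimal illustration, not the authors' code. Only n = 61 comes from the caption; the genome length (7,906 bp, taken here as the HPV16 reference length), the region length and the observed count are assumptions chosen for the example. Under a uniform model, the expected count for a region is n times that region's share of the genome, and the exact two-sided binomial p value sums the probabilities of all outcomes no more likely than the observed one.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_test_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p value.

    Sums the probabilities of every outcome that is no more likely than
    the observed one (a small relative tolerance guards against ties
    being missed through floating-point rounding).
    """
    threshold = binom_pmf(k, n, p) * (1 + 1e-7)
    total = sum(
        prob for i in range(n + 1) if (prob := binom_pmf(i, n, p)) <= threshold
    )
    return min(1.0, total)


def expected_breakpoints(n_total: int, region_len: int, genome_len: int) -> float:
    """Expected breakpoints in a region if breakpoints fall uniformly."""
    return n_total * region_len / genome_len


# Hypothetical example values (NOT taken from the paper's source data).
N_BREAKPOINTS = 61      # from the caption
GENOME_LEN = 7906       # assumed HPV16 reference genome length in bp
REGION_LEN = 1000       # hypothetical gene/region length in bp
OBSERVED = 15           # hypothetical observed count in that region

expected = expected_breakpoints(N_BREAKPOINTS, REGION_LEN, GENOME_LEN)
p_value = binom_test_two_sided(OBSERVED, N_BREAKPOINTS, REGION_LEN / GENOME_LEN)
print(f"expected = {expected:.2f}, observed = {OBSERVED}, p = {p_value:.3g}")
```

The caption does not say which two-sided convention was used; other definitions (for example, doubling the smaller tail) give slightly different p values for asymmetric distributions.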
Fig. 2. Four types of integrated HPV DNA segments in clonal integration events.
a Type A, a truncated HPV genome containing E6/E7. b Type B, a truncated HPV genome lacking E6/E7. c Type C, an overflowing continuous segment containing the intact HPV genome. d Type D, a combination of Type A, Type B, or Type C. In each panel, the two dashed arrows on the top indicate the HPV DNA fragment that was integrated into the human genome; the light blue and orange boxes on the bottom represent the human genome and virus genome, respectively. Source data are provided as a Source Data file.
Fig. 3. Schematic representations of integrated HPV DNA fragment in each clonal integration event.
The top rectangles represent a schematic diagram of the linearized HPV16 genome structure. The solid lines below the virus genome represent the HPV DNA fragment that was integrated into the human genome in each clonal integration event. The dot at the end of each line represents the breakpoint in the viral genome. The line/dot color indicates the type of integrated HPV DNA segment: Black, Type A; Blue, Type B; Red, Type C; and Green, Type D. The note “Amp” on a line indicates that the DNA fragment is amplified in the clonal integration event. If a sample had more than one event, the designation “en” is added after the sample name to distinguish the events, where n runs from 1 to the number of events. Only HPV16-related clonal integration events are shown. Source data are provided as a Source Data file.
Fig. 4. Breakpoints shared in clonal integration events and HPV integration-related inter-chromosomal translocations.
a–c Schematic illustrations of authentic chimeric reads representing clonal integration events that involve shared breakpoints: a ZLR-08, b ZLR-11, and c ZLR-12. The orange box represents the virus genome; the blue and green boxes represent different chromosomes of the human genome. Identical breakpoints are designated by the same number and linked by dashed lines. Clonal integration events in ZLR-11 and ZLR-12 induced translocations between chromosomes. Source data are provided as a Source Data file.
Fig. 5. Extrachromosomal circular virus-human DNA structures.
a A chimeric read (middle) from ZLR-11_e1 was mapped to both the human (top) and HPV16 (bottom) genomes. In the top panel, the blue histogram shows base coverage of the regions of the human genome around the HPV integration, with the chromosomal loci indicated below (x axis, in megabases) and the corresponding gene schematics above. On the x axis, red bars indicate the positions of HPV insertion breakpoints in the human genome and green bars indicate the positions of the two endpoints of the observed chimeric read (middle). The colored Arabic numerals indicate the order of these breakpoints in the human genome, which shows that the human-derived sequences on the two sides of the chimeric read are in reversed orientations. The eight black triangles indicate the regions selected for PCR validation in c below. In the bottom panel, the colored regions in the HPV16 genome represent the sequences contained in the integrated structure and the gray regions represent the sequences that were replaced or lost due to integration. b Extrachromosomal circular virus-human DNA structure. The two ends of the integrated HPV DNA fragment (orange box) are connected with the human sequence on both sides (light blue box) to form an extrachromosomal circular (ECC) DNA structure (not drawn to scale). The unobserved region (~86 kb) is indicated in gray in the circular structure. c PCR validation of amplification of eight selected regions in the clonal integration event ZLR-11_e1. Three independent experiments gave similar results. IDs of the selected regions are indicated at the top. L2, H2, J1, J2, and H3 are five regions in the ECC DNA structure that are expected to be amplified. E2, H1, and H4 are the control regions around the target amplification regions of HPV and human, respectively. The numbers at the bottom are the predicted PCR product sizes. Source data are provided as a Source Data file.
Fig. 6. Functional analysis of LENG9 in cervical cancer cell lines.
a DNA sequencing coverage depth around the target region involved in HPV integration. The blue histogram shows base coverage (y axis) with indicated chromosomal loci (x axis) and corresponding gene schematics (top). Colored bars on the x axis indicate the positions of HPV insertion breakpoints in the human genome. Gray segments indicate the human genomic segments involved in integration events (target region). b Expression levels of three genes located in the target region in all tumors (n = 103) and adjacent normal samples (n = 39). Data are shown in boxplots. The thick line in the box is the median and the box spans from Q1 (25th percentile) to Q3 (75th percentile). The whiskers extend to the most extreme observation within 1.5 times the interquartile range (IQR = Q3–Q1) from the nearest quartile. The gene expression level in ZLR-08 is highlighted by black dots. Significantly upregulated genes with adjusted p value < 0.05 are highlighted by dots of larger size. c The expression level of LENG9 was determined by qRT-PCR and western blot in the cervical cancer cell lines CaSki and SiHa transfected with LENG9 overexpression or mock vector. Data are presented as mean values ± SEM (n = 3). d–f Proliferation, migration, and invasion assays of CaSki and SiHa transfected with LENG9 overexpression or mock vector. Scale bar, 100 μm. Counts of migrated and invaded cells are summarized as histograms. Data are presented as mean values ± SEM (n = 5 for proliferation assays and n = 3 for migration and invasion assays). A two-sample t test was used to compare the overexpression and control groups. Source data are provided as a Source Data file.
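The summary statistics named in panels c–f (mean ± SEM and a two-sample t test) can be sketched as below. This is an illustration under stated assumptions, not the authors' analysis: the caption does not say whether equal variances were assumed, so the pooled-variance (Student) form is used here, and the replicate values are invented placeholders. Only the t statistic and degrees of freedom are computed; turning them into a p value needs a t-distribution CDF, for which a library routine such as `scipy.stats.ttest_ind` could be used.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates.

    SEM = sample standard deviation / sqrt(n); needs at least 2 values.
    """
    return mean(values), stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom.

    Assumes equal variances in both groups (an assumption made for this
    sketch; the caption does not specify the variant used).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical replicate measurements (n = 3 per group), not real data.
mock = [1.0, 2.0, 3.0]
overexpression = [4.0, 5.0, 6.0]

for label, group in (("mock", mock), ("overexpression", overexpression)):
    m, sem = mean_sem(group)
    print(f"{label}: {m:.2f} ± {sem:.2f} (SEM)")

t_stat, dof = student_t(mock, overexpression)
print(f"t = {t_stat:.3f}, df = {dof}")
```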
Fig. 7. The relationship between structural variations and HPV integration.
a SV counts in windows of different lengths compared between intRegion (n = 1232) and intSample (n = 88). Data are shown in boxplots. The thick line in the box is the median and the box spans from Q1 (25th percentile) to Q3 (75th percentile). The whiskers extend to the most extreme observation within 1.5 times the interquartile range (Q3–Q1) from the nearest quartile. b The relative proportion of SV types in different regions. c The SV size distribution of each SV type in different regions. d Distribution of normalized distance from SVs to nearest HPV integration sites. e Distribution of SV replication timing of each SV type in different regions. f Distribution of normalized distance from SVs to nearest TAD. g The distribution of SV pLI score in different regions for each SV type. h The composition of SV expression changes in different regions for each SV type. BND, breakend; DEL, deletion; DUP, duplication; INS, insertion; INV, inversion; INVDUP, inverted duplication. In each genomic feature, the SV types that have distinct distribution deviations between intSample and intRegion are highlighted with a blue arrow or asterisk. For each feature in every displayed SV type, the statistical significance of the difference between intSample and intRegion is summarized in Supplementary Table 6. Source data are provided as a Source Data file.
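The boxplot convention described in panel a (median line, box from Q1 to Q3, whiskers reaching the most extreme observation within 1.5 × IQR of the nearest quartile) can be written out as a short sketch. The quantile method is an assumption: linear interpolation between order statistics is used here, and plotting software may use a different rule, so quartiles can differ slightly. The input values are placeholders, not SV counts from the study.

```python
def quantile(sorted_vals, q):
    """Quantile by linear interpolation between order statistics.

    Assumed method for this sketch; other quantile definitions exist.
    """
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def boxplot_stats(values):
    """Five-number boxplot summary plus points beyond the whiskers.

    Expects a non-empty list of numbers.
    """
    vals = sorted(values)
    q1, med, q3 = (quantile(vals, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in vals if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        # Whiskers stop at the most extreme data point inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in vals if v < low_fence or v > high_fence],
    }


# Placeholder counts for one window size (hypothetical values).
example_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(boxplot_stats(example_counts))
```

Note that the whiskers end at actual observations, not at the fences themselves, which matches the caption's wording "the most extreme observation within 1.5 times the interquartile range".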
