Genome Biol. 2021 Apr 6;22(1):97.
doi: 10.1186/s13059-021-02307-0.

PCIP-seq: simultaneous sequencing of integrated viral genomes and their insertion sites with long reads

Maria Artesi et al. Genome Biol.

Abstract

The integration of a viral genome into the host genome has a major impact on the trajectory of the infected cell. Integration location and variation within the associated viral genome can influence both clonal expansion and persistence of infected cells. Methods based on short-read sequencing can identify viral insertion sites, but the sequence of the viral genomes within remains unobserved. We develop PCIP-seq, a method that leverages long reads to identify insertion sites and sequence their associated viral genome. We apply the technique to exogenous retroviruses HTLV-1, BLV, and HIV-1, endogenous retroviruses, and human papillomavirus.

Keywords: BLV; Clonal expansion; HIV; HPV; HTLV-1; Integration site analysis; Long-read sequencing; NGS; Retrovirus; Viral genome.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Overview of the PCIP-seq method. a Simplified outline of the method. Only 5′ LTR-containing circles and fragments are represented. A detailed outline is available in Supplementary Methods. b A pool of CRISPR guide RNAs targets each region, which is flanked by PCR primers. Guides and primers adjacent to the 5′ and 3′ LTRs are multiplexed. c As the region between the PCR primers is not sequenced, we created two sets of guides and primers (sets A and B). Following circularization, the sample is split, with CRISPR-mediated cleavage and PCR occurring separately for each set. After PCR, the products of the two sets of guides and primers are combined for sequencing. d Distribution of coverage across a BLV provirus (red line) and host DNA (blue line) in an expanded clone. Gray boxes: LTRs. The large drops in coverage adjacent to the LTRs correspond to the region between the PCR primers. The colored lines represent SNPs in the host genome.
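The reasoning behind panel c can be sketched in a few lines of Python. Each primer set leaves the stretch between its own PCR primers unsequenced, so a position is missed in the combined data only if it falls in a gap of both sets. The coordinates below are invented for illustration and are not taken from the paper; `intersect` is a hypothetical helper, not part of any PCIP-seq software.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open interval [start, end) on the provirus


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions present in both interval lists."""
    shared = []
    for start_a, end_a in a:
        for start_b, end_b in b:
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # keep only non-empty overlaps
                shared.append((start, end))
    return sorted(shared)


# Hypothetical unsequenced gaps (between the PCR primers) for each set.
gaps_set_a = [(600, 900), (7800, 8100)]
gaps_set_b = [(1000, 1300), (7300, 7600)]

# Positions missed after pooling are those missed by BOTH sets.
missed_after_pooling = intersect(gaps_set_a, gaps_set_b)
print(missed_after_pooling)
```

With non-overlapping gaps, as in this toy layout, pooling the two sets leaves no position unsequenced. If the gaps of the two sets overlapped, the shared stretch would remain uncovered.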
Fig. 2
PCIP-seq applied to ATL. a In ATL100, both ligation-mediated PCR with Illumina sequencing (targeting the 5′ and 3′ LTRs) and PCIP-seq with Nanopore show a single predominant HTLV-1 insertion site. b Reads from both approaches have been mapped to a custom genome where the HTLV-1 provirus has been incorporated into the host genome. The long PCIP-seq Nanopore reads show this provirus has a ~3600 bp internal deletion, removing the binding sites of the guides/primers adjacent to the 5′ LTR. c Internal deletion confirmed via long-range PCR and Illumina sequencing (gray reads map to a single position, the white reads map to both LTRs). d ATL2 clonality pie charts generated from ligation-mediated PCR with Illumina- and PCIP-seq-based sequencing data. The ATL2 tumor clone contains three proviruses inserted in chr1, chr5, and chr16 (green, orange, and blue slices, respectively), named according to the chromosome inserted into. The provirus on chr1 (green slice) is inserted into a repetitive element (LTR), and short reads generated from host DNA flanking the insertion site by Illumina sequencing map to multiple positions in the genome. Filtering out multi-mapping reads causes an underestimation of the abundance of this insertion site (13.6%, left pie chart). This can be partially corrected by retaining multi-mapping reads at this position (25.4%, central pie chart). However, that approach can cause the potentially spurious inflation of other integration sites (red slice, 9%). The long PCIP-seq reads can span repetitive elements and produce even coverage for each provirus without correction (right pie chart). e Screenshot from IGV shows representative PCIP-seq reads coming from the three proviruses (named chr1, chr16, and chr5) and mapped to four distinct regions of the HTLV-1 proviral genome at positions where de novo mutations were observed.
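The effect described in panel d, where discarding multi-mapping reads shrinks the apparent share of an insertion site in a repeat, can be illustrated with a minimal sketch. It assumes that abundance is simply each site's fraction of assigned reads, which is a simplification and may not match the quantification used in the paper. The read counts are invented toy values, and `clone_abundance` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Read = Tuple[str, bool]  # (insertion site label, maps uniquely?)


def clone_abundance(reads: Iterable[Read], keep_multimapped: bool) -> Dict[str, float]:
    """Percentage of reads per insertion site, optionally dropping multi-mappers."""
    counts = Counter()
    for site, is_unique in reads:
        if is_unique or keep_multimapped:
            counts[site] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: 100.0 * n / total for site, n in counts.items()}


# Toy data: three equally abundant proviruses (40 reads each), but most
# short reads flanking the chr1 site fall in a repeat and multi-map.
toy_reads = (
    [("chr1", True)] * 10 + [("chr1", False)] * 30
    + [("chr5", True)] * 40
    + [("chr16", True)] * 40
)

filtered = clone_abundance(toy_reads, keep_multimapped=False)
retained = clone_abundance(toy_reads, keep_multimapped=True)
print(round(filtered["chr1"], 1), round(retained["chr1"], 1))
```

In this toy case the chr1 site drops well below its true one-third share once multi-mapping reads are filtered. The sketch does not model the opposite problem noted in the caption, where retained multi-mapping reads are assigned to the wrong site and inflate it.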
Fig. 3
Variation in the BLV provirus. a Screenshot from IGV shows representative reads from a subset of the clones from each BLV-infected animal with a mutation in the first base of codon 303 in the viral protein Tax. Reads were mapped to the BLV proviral reference. The dotted red line shows the approximate position within the BLV proviral genome represented below. b Structural variants observed in the BLV provirus. Deletions (blue bars) and duplications (red bars) in BLV proviruses identified in both ovine and bovine samples sequenced by PCIP-seq are represented below the BLV proviral genome.
Fig. 4
Location of HIV-1 proviral integration sites identified by PCIP-seq in patients on cART. a HIV-1 proviral integration sites identified by PCIP-seq in two HIV-1 patients (02006 and 06042). Black lines represent integration sites where the portion of the provirus sequence shows no evidence of a large deletion, and red lines indicate sites where a large deletion was observed in the provirus. Detailed information for each HIV-1 integration site identified by PCIP-seq is available in Additional file 2. b A hotspot of proviral integration in intron 1 of STAT5B. Arrows represent individual proviruses (02006 = blue, 06042 = orange), and direction indicates the orientation of the provirus. All proviruses have the same transcriptional orientation as STAT5B
Fig. 5
Location of endogenous retroviruses identified by PCIP-seq in cattle and sheep genomes. Based on three cattle and two sheep. Black lines represent full-length proviruses, and red lines represent proviruses containing large deletions. Detailed information for each integration site identified by PCIP-seq is available in Tables S5 and S6
Fig. 6
HPV integration site in an expanded clone. In this expanded clone, HPV shows evidence of “looping” integration [18, 44], whereby noncontiguous genomic sequences are brought adjacent to one another. a PCIP-seq reads mapping to a ~87-kb region on chr3 revealed three HPV-host breakpoints. The large number of reads suggests expansion of the clone carrying these integrations. b PCR was carried out with primer pairs matching regions α and β, as well as α and γ. Both primer pairs produced a ~9 kb PCR product. Nanopore sequencing of the PCR products shows that the HPV genome connects these breakpoints. c Schematic of the breakpoints with the integrated HPV genome. This conformation indicates that this dramatic structural rearrangement in the host genome was generated via “looping” integration of the HPV genome.

References

    1. Bushman F, Lewinski M, Ciuffi A, Barr S, Leipzig J, Hannenhalli S, et al. Genome-wide analysis of retroviral DNA integration. Nat Rev Microbiol. 2005;3:848–858. doi: 10.1038/nrmicro1263.
    2. Gillet NA, Malani N, Melamed A, Gormley N, Carter R, Bentley D, et al. The host genomic environment of the provirus determines the abundance of HTLV-1-infected T-cell clones. Blood. 2011;117:3113–3122. doi: 10.1182/blood-2010-10-312926.
    3. Maldarelli F, Wu X, Su L, Simonetti FR, Shao W, Hill S, et al. Specific HIV integration sites are linked to clonal expansion and persistence of infected cells. Science. 2014;345:179–183. doi: 10.1126/science.1254194.
    4. Wagner TA, McLaughlin S, Garg K, Cheung CYK, Larsen BB, Styrchak S, et al. HIV latency. Proliferation of cells with HIV integrated into cancer genes contributes to persistent infection. Science. 2014;345:570–573. doi: 10.1126/science.1256304.
    5. Bruner KM, Wang Z, Simonetti FR, Bender AM, Kwon KJ, Sengupta S, et al. A quantitative approach for measuring the reservoir of latent HIV-1 proviruses. Nature. 2019;566:1–19. doi: 10.1038/s41586-019-0898-8.