Nucleic Acids Res. 2022 Sep 23;50(17):e98.
doi: 10.1093/nar/gkac516.

Selective ablation of 3' RNA ends and processive RTs facilitate direct cDNA sequencing of full-length host cell and viral transcripts

Christian M Gallardo et al. Nucleic Acids Res. 2022.

Abstract

Alternative splicing (AS) is necessary for viral proliferation in host cells and a critical regulatory component of viral gene expression. Conventional RNA-seq approaches provide incomplete coverage of AS due to their short read lengths and are susceptible to biases and artifacts introduced in prevailing library preparation methodologies. Moreover, viral splicing studies are often conducted separately from host cell transcriptome analysis, precluding an assessment of the viral manipulation of host splicing machinery. To address current limitations, we developed a quantitative full-length direct cDNA sequencing strategy to simultaneously profile viral and host cell transcripts. This nanopore-based approach couples processive reverse transcriptases with a novel one-step chemical ablation of 3' RNA ends (termed CASPR), which decreases ribosomal RNA reads and enriches polyadenylated coding sequences. We extensively validate our approach using synthetic reference transcripts and show that CASPR doubles the breadth of coverage per transcript and increases detection of long transcripts (>4 kb), while being functionally equivalent to PolyA+ selection for transcript quantification. We used our approach to interrogate host cell and HIV-1 transcript dynamics during viral reactivation and identified novel putative HIV-1 host factors containing exon skipping or novel intron retentions and delineated the HIV-1 transcriptional state associated with these differentially regulated host factors.

Figures

Figure 1.
CASPR improves the specificity of oligo-d(T) primed RT when using total RNA inputs by reducing rRNA and increasing coverage evenness of protein-coding transcripts. (A) One percent agarose gel electrophoresis of double-stranded cDNA products that were reverse transcribed with oligo-d(T) priming with SSIV or MRT with no CDS enrichment (control), CASPR or PolyA+ selection. (B) cDNA yield of different RT and CDS enrichment combinations as measured spectrophotometrically. (C) Fraction of reads uniquely mapped to the listed references using nanopore sequencing. (D) Intragenic and intergenic read distributions. (E) Gene body coverage of protein-coding transcripts and (F) cumulative frequency distribution of gene body coverage. All values are means ± SEM. Statistical significance was calculated with two-way ANOVA with Tukey multiple comparison test: *P< 0.05; **P< 0.01; ***P< 0.001; ****P< 0.0001.
Figure 2.
Assay and bioinformatic workflow for analytical performance validation. (A) Total RNA was isolated from Nalm6 cells and pooled into a single tube. Total RNA was spiked with SIRV-Set 4 at a concentration of 0.13 ng SIRVs per μg of total RNA. Three CDS enrichment conditions (control, CASPR and PolyA+ selection) were tested in parallel, all drawing identical RNA inputs from the same total RNA sample in triplicate per condition. Following CDS enrichment, all samples were eluted in 17 μl of EB, followed by reverse transcription using 10 μl of input and one-pot second-strand synthesis using a modified Gubler and Hoffman method. Following second-strand synthesis cleanup, identical volumes of double-stranded cDNA samples were barcoded and prepared for sequencing using ONT Native Barcoding (EXP-NBD104, EXP-NBD-114) and Ligation Sequencing (SQK-LSK-109) Kits. Following library preparation, all samples were eluted in identical volumes of EB, and equal volumes of each sample were pooled and sequenced via ONT MinION using R9.4.1 chemistry. (B) Bioinformatic workflow used for data generation and analysis throughout the manuscript. Typical outputs for each analysis are shown, with text in bold denoting specific tools used for a particular output.
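The input arithmetic in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the three constants come from the caption, while the function names and the 5 μg example input are hypothetical.

```python
# Values stated in the Figure 2 caption.
SIRV_NG_PER_UG_TOTAL_RNA = 0.13  # ng of SIRV spike-in per ug of total RNA
ELUTION_VOLUME_UL = 17.0         # EB elution volume after CDS enrichment
RT_INPUT_VOLUME_UL = 10.0        # volume carried into reverse transcription


def sirv_spike_in_ng(total_rna_ug: float) -> float:
    """Mass of SIRV spike-in (ng) for a given mass of total RNA (ug)."""
    if total_rna_ug < 0:
        raise ValueError("total RNA mass must be non-negative")
    return SIRV_NG_PER_UG_TOTAL_RNA * total_rna_ug


def fraction_carried_into_rt() -> float:
    """Fraction of each eluted sample that is used as RT input."""
    return RT_INPUT_VOLUME_UL / ELUTION_VOLUME_UL


if __name__ == "__main__":
    # Hypothetical input mass; the caption does not state the amount used.
    example_total_rna_ug = 5.0
    print(f"SIRV spike-in: {sirv_spike_in_ng(example_total_rna_ug):.2f} ng")
    print(f"Fraction of eluate used for RT: {fraction_carried_into_rt():.3f}")
```

Because every condition draws the same input and uses the same elution and RT volumes, the fraction carried forward is identical across the control, CASPR and PolyA+ samples.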
Figure 3.
Validation with synthetic reference standards shows that CASPR is functionally equivalent to PolyA+ selection, but results in higher cDNA yield, coverage evenness and capture of long transcripts. (A) Percent of reads uniquely mapped to SIRV reference sequences. (B) Correlations of gene expression TPM values with absolute input amounts of each synthetic transcript (in attomoles) for ERCC subsets. (C) Hg38 gene expression correlations between different RTs and CDS enrichment strategies. (D) Transcript discovery sensitivity calculation using FLAIR-derived transcriptome and hg38 gtf annotation file. (E) Efficiency of capture of long SIRVs of 4, 6 and 8 kb size classes. (F) Raw coverage visualized via IGV of all long SIRVs for each RT and CDS enrichment strategy combination. All samples were run in triplicate (n = 3). All values are means ± SEM. Statistical significance was calculated with two-way ANOVA with Tukey multiple comparison test: *P< 0.05; **P< 0.01; ***P< 0.001; ****P< 0.0001.
Figure 4.
Evaluation of RT conditions and CDS enrichment strategies in capture of host and viral transcripts in a cell line actively expressing HIV. (A) Host cell gene expression correlations for each CDS enrichment strategy when using SSIV and MRT. (B) Gene body coverage of protein-coding hg38 transcripts. (C) Frequency of host cell transcript lengths derived from the FLAIR isoform analysis pipeline, binned at 1000 bp intervals. (D) Coverage map of HIV reads. All samples were run in duplicate (n = 2). (E) Visualization of isoform structure of multiexonic HIV transcripts processed with the Pinfish pipeline. All values are means ± SEM. Statistical significance was calculated with two-way ANOVA with Tukey multiple comparison test: *P< 0.05; **P< 0.01; ***P< 0.001; ****P< 0.0001.
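The 1000 bp length binning mentioned for Figure 4C could be reproduced along these lines. This is a sketch under assumptions: the function name is hypothetical, the bins are taken as half-open intervals [start, start + 1000), and the input is a plain list of transcript lengths rather than actual FLAIR output.

```python
from collections import Counter
from typing import Dict, Iterable


def bin_transcript_lengths(lengths_bp: Iterable[int], bin_width: int = 1000) -> Dict[int, int]:
    """Count transcripts per length bin.

    Keys are the lower edge of each bin in bp, so key 4000 covers
    lengths from 4000 up to (but not including) 5000 when bin_width is 1000.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter((length // bin_width) * bin_width for length in lengths_bp)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical transcript lengths, for illustration only.
    example_lengths = [500, 999, 1000, 4200, 4999]
    for lower_edge, n in bin_transcript_lengths(example_lengths).items():
        print(f"{lower_edge}-{lower_edge + 999} bp: {n}")
```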
Figure 5.
DIE analysis shows that putative HIV host factors PSAT1 and PSD4 are alternatively spliced in host cells upon HIV reactivation. (A) Heatmap showing hierarchically clustered TPM values for differentially expressed isoforms (P-value < 0.1). Highly significant hits (Padj < 0.1) are in bold, while isoforms shaded in purple are also present in the Jurkat control group. (B) A PSAT1 isoform lacking the functionally important exon 8 is differentially downregulated upon HIV reactivation with TNF-α. (C) An unproductive PSD4 isoform containing a novel intron retention event is predominantly expressed in J-Lat cells prior to HIV induction. Upon HIV reactivation with TNF-α, the intron retention event is downregulated and the productive isoform is upregulated.
Figure 6.
HIV transcriptional signature, gene expression and splice acceptor/donor usage for TNF-α-induced viral reactivation in J-Lat 10.6 cells. (A) Idealized splicing structures of HIV genes and their CDS regions. (B) HIV multiexonic isoform clusters observed across four replicates are color coded based on count numbers; isoform clusters are annotated with likely gene expressed and differentiating splice acceptor junction. Noncanonical/novel isoforms are labeled with an asterisk. (C) Gene expression fractions calculated based on counts obtained per isoform cluster, gene assignment based on proximity of ORF to 5′ end and presence of undisrupted CDS. Splice (D) acceptor and (E) donor usage. (F) Splice junction matrix with log2 normalized counts shows association and frequency of specific splice donor/acceptor junctions.
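The count-based gene expression fractions described for Figure 6C amount to summing counts over the isoform clusters assigned to each gene and dividing by the total. The sketch below assumes that simple definition; the function name and the example counts are hypothetical, and the gene assignment step (ORF proximity to the 5′ end, undisrupted CDS) is taken as already done.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def expression_fractions(cluster_counts: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Convert per-cluster read counts into per-gene expression fractions.

    cluster_counts holds one (assigned_gene, read_count) pair per isoform
    cluster; several clusters may be assigned to the same gene.
    """
    totals: Dict[str, int] = defaultdict(int)
    for gene, count in cluster_counts:
        if count < 0:
            raise ValueError("read counts must be non-negative")
        totals[gene] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no reads to normalize")
    return {gene: n / grand_total for gene, n in totals.items()}


if __name__ == "__main__":
    # Hypothetical counts for three isoform clusters, for illustration only.
    example = [("nef", 60), ("rev", 30), ("nef", 10)]
    for gene, fraction in expression_fractions(example).items():
        print(f"{gene}: {fraction:.2f}")
```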
