PLoS Pathog. 2022 Sep 12;18(9):e1010797.
doi: 10.1371/journal.ppat.1010797. eCollection 2022 Sep.

Novel viral splicing events and open reading frames revealed by long-read direct RNA sequencing of adenovirus transcripts

Alexander M Price et al. PLoS Pathog. 2022.

Abstract

Adenovirus is a common human pathogen that relies on host cell processes for transcription and processing of viral RNA and protein production. Although adenoviral promoters, splice junctions, and polyadenylation sites have been characterized using low-throughput biochemical techniques or short read cDNA-based sequencing, these technologies do not fully capture the complexity of the adenoviral transcriptome. By combining Illumina short-read and nanopore long-read direct RNA sequencing approaches, we mapped transcription start sites and RNA cleavage and polyadenylation sites across the adenovirus genome. In addition to confirming the known canonical viral early and late RNA cassettes, our analysis of splice junctions within long RNA reads revealed an additional 35 novel viral transcripts that meet stringent criteria for expression. These RNAs include 14 new splice junctions that lead to expression of canonical open reading frames (ORFs), 6 novel ORF-containing transcripts, and 15 transcripts encoding messages that could alter protein functions through truncation or fusion of canonical ORFs. In addition, we detect RNAs that bypass canonical cleavage sites and generate potential chimeric proteins by linking distinct gene transcription units. Among these chimeric proteins we detected an evolutionarily conserved protein containing the N-terminus of E4orf6 fused to the downstream DBP/E2A ORF. Loss of this novel protein, E4orf6/DBP, was associated with aberrant viral replication center morphology and poor viral spread. Our work highlights how long-read sequencing technologies combined with mass spectrometry can reveal further complexity within viral transcriptomes and resulting proteomes.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. RNA-seq reveals high-confidence SNPs within the Ad5 genome.
The 35,938 base pair linear genome of Ad5 is displayed in the traditional left to right format. Major transcriptional units are shown as boxes above or below the genome with arrowheads denoting the orientation of the open reading frames (ORFs) encoded within. Grey boxes denote early gene transcriptional units while black boxes denote late genes. Bcftools was used to analyze short-read RNA-seq data to predict single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) that approach 100% of the RNA reads when compared to the reference Ad5 genome (AC_000008). In total, 24 such SNPs were discovered and their positions within the genome are highlighted by red vertical lines. For each SNP, the nucleotide position as well as the top strand reference base and corrected base are shown in black text (nucleotide position, reference base -> corrected base). If indicated SNPs fell within untranslated regions (UTR), or did not change the encoded amino acid of any annotated reading frame potentially impacted by the SNP, these were marked with blue text denoting either UTR or Syn (synonymous mutation), respectively. For any SNP that led to an amino acid change within an annotated ORF, the ORF as well as the identities of the reference and corrected amino acids are highlighted in red.
Fig 2. Combined short-read and long-read sequencing showcases adenovirus transcriptome complexity.
A549 cells were infected with Ad5 for 24 hours before RNA was extracted and subjected to both short-read and long-read sequencing. Sequence coverage provided by short-read stranded RNA-seq (Illumina, light blue), as well as nanopore long-read direct RNA-seq (Nanopore dRNA, dark blue), is shown along the Ad5 genome. For both tracks, reads aligning to the forward strand are plotted above the genome, while reads aligning to the reverse strand are shown below the genome. For dRNA-seq datasets, reads can be reduced to their 5’ and 3’ ends and peak-calling applied to predict individual transcription start sites (TSS, green vertical lines) or cleavage and polyadenylation sites (CPAS, magenta vertical lines), respectively. The TSS for the L4 promoter was only detected at 12 hpi and is thus displayed in brackets. Similarly, the ContextMap algorithm can predict CPAS, albeit at lower sensitivity, from poly(A)-containing fragments within Illumina RNA-seq data (ContextMap, light blue vertical lines). Individual RNA transcripts are shown above and below the genome, with thin bars denoting 5’ and 3’ untranslated regions (UTR), thick bars denoting open reading frames (ORFs), and thin lines with arrowheads denoting both introns and orientation of transcription. Previously characterized early genes are denoted in grey, while previously characterized late genes are denoted in black. RNA isoforms discovered in this study are highlighted in red. Names of transcriptional units are shown under each cluster of transcripts, while the name of the protein derived from the respective ORF is listed after each transcript to the right or left. The positions of the Pol III-derived virus-associated noncoding RNAs VA-I and VA-II are highlighted in teal boxes.
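The reduction of dRNA-seq reads to their 5’ and 3’ ends described in this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Alignment` record, the 1-based inclusive coordinates, and the function names are assumptions made for illustration, and the peak-calling step itself is not implemented. The sketch only shows that the 5’ end of a forward-strand read is its leftmost aligned position, while for a reverse-strand read it is the rightmost.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Alignment(NamedTuple):
    """Hypothetical minimal alignment record (not a real library type)."""
    start: int   # leftmost aligned genome position (assumed 1-based, inclusive)
    end: int     # rightmost aligned genome position (assumed 1-based, inclusive)
    strand: str  # "+" for forward strand, "-" for reverse strand


def read_ends(aln: Alignment) -> Tuple[int, int]:
    """Return (5' end, 3' end) of the RNA in genome coordinates."""
    if aln.strand == "+":
        return aln.start, aln.end
    if aln.strand == "-":
        # On the reverse strand the RNA runs right to left along the genome.
        return aln.end, aln.start
    raise ValueError(f"unknown strand: {aln.strand!r}")


def end_pileups(alignments: Iterable[Alignment]) -> Tuple[Counter, Counter]:
    """Count reads per (strand, position) for 5' ends and for 3' ends."""
    five_prime: Counter = Counter()
    three_prime: Counter = Counter()
    for aln in alignments:
        p5, p3 = read_ends(aln)
        five_prime[(aln.strand, p5)] += 1
        three_prime[(aln.strand, p3)] += 1
    return five_prime, three_prime


# Made-up example reads, for illustration only.
example_reads = [
    Alignment(100, 900, "+"),
    Alignment(100, 850, "+"),
    Alignment(500, 2000, "-"),
]
five_prime_counts, three_prime_counts = end_pileups(example_reads)
```

Positions where many 5’ ends pile up would be candidate TSS, and pile-ups of 3’ ends would be candidate CPAS; any real peak caller would add windowing and thresholds on top of these counts.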
Fig 3. Direct RNA Sequencing (dRNA-seq) unambiguously distinguishes early and late transcription.
(A) dRNA-seq was performed on polyadenylated RNA from Ad5-infected A549 cells extracted at 12 hours post-infection (hpi). Sequence reads were aligned to the re-annotated transcriptome and filtered to retain only unambiguous primary alignments. Normalized read count indicates the number of RNAs for a particular transcript once normalized to the total number of mappable reads (human plus adenovirus) for the entire sequencing reaction. For all panels, grey bars indicate early genes, black bars indicate late genes, and red bars indicate novel isoforms discovered in this study. Undetectable transcripts (nd) or those with fewer than 10 counts of a particular isoform detected (<10) are indicated. (B) Same as in Panel A, but with RNA harvested at 24 hpi.
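The normalization described in this legend (isoform counts divided by the total number of mappable reads, human plus adenovirus) can be sketched as follows. This is an illustrative sketch only: the per-million scale factor, the function names, and the example counts are assumptions, since the legend does not state the scaling used.

```python
from typing import Dict


def normalized_read_counts(
    transcript_counts: Dict[str, int],
    total_mappable_reads: int,
    scale: float = 1_000_000,  # assumed scale factor; not stated in the legend
) -> Dict[str, float]:
    """Scale each isoform count by total mappable reads (human plus adenovirus)."""
    if total_mappable_reads <= 0:
        raise ValueError("total_mappable_reads must be positive")
    return {
        name: count * scale / total_mappable_reads
        for name, count in transcript_counts.items()
    }


def detection_label(count: int, min_count: int = 10) -> str:
    """Mirror the legend's flags: 'nd' for undetectable, '<10' for low counts."""
    if count == 0:
        return "nd"
    if count < min_count:
        return f"<{min_count}"
    return str(count)


# Made-up raw counts, for illustration only.
raw_counts = {"E1A": 50, "Fiber": 2000, "novel_isoform": 7, "absent_isoform": 0}
normalized = normalized_read_counts(raw_counts, total_mappable_reads=1_000_000)
labels = {name: detection_label(count) for name, count in raw_counts.items()}
```

Dividing by all mappable reads, rather than by viral reads alone, keeps the 12 hpi and 24 hpi panels comparable even though the viral fraction of the library changes over the course of infection.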
Fig 4. Novel fusion transcript between E4orf6 and DBP is expressed, translated, and conserved.
(A) Enlarged transcriptome map of Ad5 E4 and E2A transcriptional units. Promoter transcription start sites are indicated with left-facing arrows, and cleavage and polyadenylation sites (pA) labeled with downward-facing arrows. Novel E4-derived transcripts that terminate in E2A are highlighted in red. Ad5 mutant viruses dl1004 and dl355 contain deletions (indicated by hashed boxes) that remove most of the E4 region and splice donors (ΔE4) or a 14-base deletion inside E4orf6 that only abrogates E4orf6 expression (ΔE4orf6). (B) Reverse-transcriptase PCR on cDNA derived from Ad5 infection of A549 cells reveals characteristic bands of both E4orf6/Unk and E4orf6/DBP. L denotes DNA ladder and triangle indicates increasing cDNA concentration. (C) A549 or W162 cells were uninfected (mock) or infected with WT Ad5, ΔE4orf6, or ΔE4 viruses for 40 hours and proteins detected by immunoblot analysis. When blotting with antisera raised against the N-terminus of E4orf6, a prominent band is detected at the predicted size of E4orf6/DBP. This band is absent during ΔE4 infection and not observed in W162 cells where only the E4 region is provided in trans. Kilodalton size markers are shown to the left of each blot. (D) Infections and immunoblot analysis were performed as described in panel C. Two independently derived anti-N-terminal E4orf6 antibodies (RSA3 and M45) detect E4orf6/DBP. (E) Proteins expressed over a time-course of Ad5 infection in A549 were detected by immunoblot analysis at indicated hpi. (F) A549 cells were infected with adenoviruses from four different serotypes in a time-course. All tested adenovirus serotypes express a protein corresponding to E4orf6/DBP. (G) Quantitative reverse-transcriptase PCR was performed to demonstrate mRNA accumulation of Ad5 E1A (a representative early transcript), Fiber (a representative late transcript), and three E4orf6-containing transcript isoforms. Transcripts were normalized to the internal HPRT1 housekeeping gene and to the expression level at 48 hpi.
Fig 5. The E4orf6 protein generates an E3 ubiquitin ligase complex with E1B55K but this is not observed for the E4orf6/DBP fusion.
(A) Domain map of DBP and E4orf6 open reading frames. HR: host range, NLS: nuclear localization sequence, NES: nuclear export sequence, NRS: nuclear retention sequence. (B) HEK293 cells were transfected with E4orf6, flag-E4orf6, or flag-E4orf6/DBP and subjected to immunoblot analysis. E4orf6, but not E4orf6/DBP, induces degradation of Mre11 and Rad50 through a complex with E1B55K. (C) HEK293 cells were transfected with flag-E4orf6 or flag-E4orf6/DBP and then subsequently mock-infected or Ad5-infected for 24 hours. Immunofluorescence was performed for flag tag (green) or cellular USP7 (magenta). Both proteins are nuclear but only E4orf6/DBP localizes to viral replication centers marked by USP7. (D) HEK293 cells were transfected with flag-E4orf6 or flag-E4orf6/DBP. Immunofluorescence was performed for flag tag (green) or integrated E1B55K (magenta). Only E4orf6 induces relocalization of E1B55K from cytoplasmic aggresomes to the nucleus. Dashed white lines outline the nuclear periphery. White scale bar denotes 10 μm.
Fig 6. Loss of E4orf6/DBP has minimal impact on viral genome replication or protein expression.
(A) E4orf6 splice donor is recognized by base-pairing to the cellular U1 snRNP. Silent mutations shown in red abrogate downstream splicing and were used to create the E4orf6/7, E4orf6/DBP double knockout virus (E4orf6ΔSS). (B) A549 cells were infected with WT Ad5, E4orf6ΔSS, or ΔE4orf6 (dl355) and harvested over a time-course of infection (hpi, hours post-infection). Immunoblot analysis was performed with antibodies to detect the indicated viral and cellular proteins. Kilodalton size markers are shown to the left of each blot. (C) Immunoblot analysis of a time-course infection was performed as in panel B. Cell lines were parent A549 or A549 transduced to express flag-E4orf6/DBP under control of the adenoviral E4 promoter. (D) A549 cells were infected with WT Ad5 or E4orf6ΔSS in biological triplicate. Adenoviral genome copy number was determined by qPCR and normalized to the amount of input genomes at 4 hpi. Significance was determined by unpaired, two-tailed t-test (n.s., not significant; **, p-value<0.01). (E) A549 cells were infected with WT Ad5, ΔE4orf6, or E4orf6ΔSS for 24 hours before immunofluorescence was performed. Cells were stained with antibodies against E4orf6 N-terminal domain (RSA3, green) or cellular USP7 as a marker of viral replication centers (magenta). Dashed white lines outline the nuclear periphery. Open white arrowheads denote uninfected cells with diffuse USP7 and no viral staining, while closed arrowheads denote infected cells that lack E4orf6 staining but show USP7 at viral replication centers. White scale bar denotes 10 μm.
Fig 7. Loss of E4orf6/DBP leads to altered viral replication center morphology and small plaque phenotypes.
(A) Wildtype adenovirus infection leads to a temporal progression of viral replication center (VRC) morphology from early stage (0. Early) through three distinct stages of late replication centers (Late 1–3). A novel, large replication center morphology was seen with infection of E4orf6ΔSS virus. Representative immunofluorescence images for each stage are shown by staining A549 cells with VRC marker DBP (magenta) and DAPI for DNA (grey). (B) Parent A549 (Parent) or A549 cells expressing E4p-flag-E4orf6/DBP (E4orf6/DBP) were infected with WT Ad5 or E4orf6ΔSS virus for 24 hours. Viral replication centers were stained for DBP, and morphology was scored in a blinded manner using the key provided in panel A. The percentage of infected cells demonstrating each replication center morphology is shown as a bar chart. Data are representative of three independent experiments. (C) HEK293 cells were infected with limiting dilution of WT Ad5 or E4orf6ΔSS for six days to allow the formation of plaques. Plaques were negatively stained with crystal violet and imaged. Scale bar shows 5 mm. (D) WT Ad5 or E4orf6ΔSS (ΔSS) plaque size in HEK293 cells was quantified with ImageJ and normalized to the median plaque size in WT infection. Individual plaques are plotted as a single point. (E) Plaque size was quantified in Vero cells or Vero-W162 cells. W162 cells express the adenovirus E4 region in trans and complement the loss of E4orf6/7 in the ΔSS virus. (F) Plaque size was quantified in A549 cells or A549:E4orf6/DBP cells. The A549:E4orf6/DBP cells express E4orf6/DBP under control of the E4 promoter in trans, and thus complement the loss of E4orf6/DBP in the ΔSS virus. For all plaque assays, red error bars denote median and interquartile range. Statistical significance was determined using the non-parametric Mann-Whitney test (n.s., not significant; ***, p-value<0.001).
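The plaque-size normalization and summary statistics described in panels D–F can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' analysis: the function names and the example sizes are hypothetical, the quartile convention is the default of Python's `statistics.quantiles` (the legend does not say which convention was used), and the Mann-Whitney comparison is not reproduced here.

```python
from statistics import median, quantiles
from typing import List, Sequence, Tuple


def normalize_to_wt_median(
    wt_sizes: Sequence[float], mutant_sizes: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Divide every plaque size by the median plaque size of the WT infection."""
    wt_median = median(wt_sizes)
    if wt_median <= 0:
        raise ValueError("median WT plaque size must be positive")
    wt_norm = [size / wt_median for size in wt_sizes]
    mutant_norm = [size / wt_median for size in mutant_sizes]
    return wt_norm, mutant_norm


def median_and_iqr(values: Sequence[float]) -> Tuple[float, Tuple[float, float]]:
    """Return the median and the (first quartile, third quartile) pair."""
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' method (assumed)
    return median(values), (q1, q3)


# Made-up plaque areas in arbitrary units, for illustration only.
wt_plaques = [2.0, 4.0, 6.0]
mutant_plaques = [1.0, 2.0, 3.0]
wt_norm, mutant_norm = normalize_to_wt_median(wt_plaques, mutant_plaques)
mutant_median, (mutant_q1, mutant_q3) = median_and_iqr(mutant_norm)
```

After this normalization the WT median is 1 by construction, so the mutant median reads directly as a fraction of typical WT plaque size.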
