Sci Rep. 2017 Nov 22;7(1):15989.
doi: 10.1038/s41598-017-16262-z.

Long-Read Sequencing of Human Cytomegalovirus Transcriptome Reveals RNA Isoforms Carrying Distinct Coding Potentials

Zsolt Balázs et al. Sci Rep.

Abstract

The human cytomegalovirus (HCMV) is a ubiquitous, human pathogenic herpesvirus. The complete viral genome is transcriptionally active during infection; however, a large part of its transcriptome has yet to be annotated. In this work, we applied the amplified isoform sequencing technique from Pacific Biosciences to characterize the lytic transcriptome of HCMV strain Towne varS. We developed a pipeline for transcript annotation using long-read sequencing data. We identified 248 transcriptional start sites, 116 transcriptional termination sites and 80 splicing events. Using this information, we have annotated 291 previously undescribed or only partially annotated transcript isoforms, including eight novel antisense transcripts and their isoforms, as well as a novel transcript (RS2) in the short repeat region, partially antisense to RS1. Similarly to other organisms, we discovered a high transcriptional diversity in HCMV, with many transcripts only slightly differing from one another. Comparing our transcriptome profiling results to an earlier ribosome footprint analysis, we have concluded that the majority of the transcripts contain multiple translationally active ORFs, and also that most isoforms contain unique combinations of ORFs. Based on these results, we propose that one important function of this transcriptional diversity may be to provide a regulatory mechanism at the level of translation.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Genomic rearrangement of the HCMV isolate used in this study. Panel A shows schematic representations of the original Towne strain virus (above) and the isolate used in our experiments (below). The unique long (UL) and unique short (US) sequences are bracketed by repeat sequences (ac), marked by coloured rectangles (brown, pink and green, respectively). In our isolate, the UL/b’ (180887–193945) region of the FJ616285 genome is substituted by the 710–11996 region. To confirm this rearrangement, primers were designed in the ends of the UL (Pfw) and b’ (Prev) regions, 243 nt apart. Panel B shows the PCR product of approximately 250 nt. The original gel photo, from which Panel B was cropped, is shown in Supplementary Fig. S1.
Figure 2
Transcriptome profiling pipeline. Long-consensus reads were aligned to the HCMV genome. Reads with a high mismatch or indel ratio (>5%) were discarded. All good-quality reads were used to identify splice junctions. Only the deletions complying with the GT-AG rule were accepted as splice junctions. Reads containing 15 terminal (A) mismatches (i.e. a poly(A) tail) were considered for the validation of TESs. A TES was accepted as valid if two reads confirmed the same nucleotide position and the genomic region did not contain a stretch of 3 or more (A)s. Reads with a definite orientation were considered for the identification of TSSs. If the number of reads starting at a given genomic position was significantly higher than would be expected according to the Poisson distribution, the genomic position was accepted as a TSS. Transcript isoforms were annotated based on reads containing the above-mentioned annotated features.
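Three of the filters named in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the forward-strand-only handling, the background rate `mean_reads_per_pos` and the significance level `alpha` are all assumptions introduced here for illustration, since the caption does not state them.

```python
import math


def passes_quality(mismatches_and_indels: int, aligned_length: int) -> bool:
    """Keep a read only if its mismatch/indel ratio does not exceed 5%."""
    return mismatches_and_indels / aligned_length <= 0.05


def is_gt_ag_intron(genome: str, start: int, end: int) -> bool:
    """Check the GT-AG rule for a deleted span genome[start:end].

    Assumption: the transcript lies on the forward strand; a reverse-strand
    intron would appear as CT...AC in forward coordinates.
    """
    intron = genome[start:end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def poisson_upper_tail(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def is_tss(reads_at_pos: int, mean_reads_per_pos: float, alpha: float = 0.01) -> bool:
    """Accept a position as a TSS if its read-start count is unexpectedly high.

    Both the background rate and alpha are hypothetical parameters.
    """
    return poisson_upper_tail(reads_at_pos, mean_reads_per_pos) < alpha
```

With a hypothetical background of 2 read starts per position, 20 starts at one position would be accepted as a TSS, whereas 2 starts would not.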
Figure 3
The spread of TESs. The scatter plot shows the frequency of reads ending in the vicinity of true TESs (blue) and that of false TESs (red), which are genomic locations containing stretches of 3 (A)s or more. Error bars represent standard errors.
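The distinction between true and false TESs can be sketched as a simple check, assuming a forward-strand genomic window around the candidate site. The window itself, the function names and the `min_reads` parameter are hypothetical; only the two-read requirement and the threshold of 3 or more (A)s come from the captions.

```python
import re


def has_a_stretch(region: str, min_len: int = 3) -> bool:
    """True if the region contains a run of at least min_len consecutive A's."""
    return re.search("A{%d,}" % min_len, region.upper()) is not None


def is_valid_tes(supporting_reads: int, genomic_region: str, min_reads: int = 2) -> bool:
    """Accept a TES only with enough read support and no A-rich stretch nearby.

    An A-rich genomic stretch can mimic a poly(A) tail, so ends falling
    in such regions are treated as potentially false.
    """
    return supporting_reads >= min_reads and not has_a_stretch(genomic_region)
```

For example, `is_valid_tes(2, "GCAAAT")` rejects the site because the window contains three consecutive (A)s.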
Figure 4
Read and transcript length distribution. The length distributions of reads aligning to the HCMV genome from the polyA-selected (red) and random (blue) libraries are presented, together with the length distribution of the identified transcripts (black).
Figure 5
Transcript isoforms contain different ORFs. The figure shows an example of the differential peptide coding capacity (above) of transcript isoforms (below). Canonical ORFs are represented as arrows with a grey background; the other translationally active ORFs are represented as empty arrows and named as published by Stern-Ginossar et al. Dotted vertical lines mark the translational start sites of the ORFs. The transcript isoforms of these two genes can be differentially translated due to polycistronism (US27-28 or only US28), alternative splicing (the splicing in US27 leads to the excision of 75 nucleotides and does not cause a frameshift) or alternative transcription initiation (leading to a truncated protein in the cases of ORFS364W and US28).
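The statement that excising 75 nucleotides causes no frameshift follows from 75 being a multiple of the codon length. A short sketch of that arithmetic, with hypothetical helper names:

```python
CODON_LENGTH = 3  # nucleotides per codon


def preserves_reading_frame(excised_nt: int) -> bool:
    """An excision keeps the downstream frame if it removes whole codons."""
    return excised_nt % CODON_LENGTH == 0


def codons_removed(excised_nt: int) -> int:
    """Number of complete codons lost to an in-frame excision."""
    return excised_nt // CODON_LENGTH
```

Here `preserves_reading_frame(75)` holds and `codons_removed(75)` is 25, so the spliced US27 isoform would encode a protein 25 amino acids shorter, provided the excised segment lies entirely within the coding sequence.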
Figure 6
Conservation of the novel transcripts. The nucleotide sequences of the genomic regions corresponding to the longest transcript isoforms were aligned to publicly available HCMV genomes. The similarities of these regions in other genomes compared to the sequence in FJ616285 are depicted in a boxplot. The whiskers represent the range of the data. In some cases, the median and the quartile values are the same.
