PLoS Pathog. 2018 Nov 1;14(11):e1007314.
doi: 10.1371/journal.ppat.1007314. eCollection 2018 Nov.

A planarian nidovirus expands the limits of RNA genome size

Amir Saberi et al. PLoS Pathog.

Abstract

RNA viruses are the only known RNA-protein (RNP) entities capable of autonomous replication (albeit within a permissive host environment). A 33.5 kilobase (kb) nidovirus has been considered close to the upper size limit for such entities; conversely, the minimal cellular DNA genome is in the 100-300 kb range. This large difference presents a daunting gap for the transition from primordial RNP to contemporary DNA-RNP-based life. Whether or not RNA viruses represent transitional steps towards DNA-based life, studies of larger RNA viruses advance our understanding of the size constraints on RNP entities and the role of genome size in virus adaptation. For example, emergence of the largest previously known RNA genomes (20-34 kb in positive-stranded nidoviruses, including coronaviruses) is associated with the acquisition of a proofreading exoribonuclease (ExoN) encoded in the open reading frame 1b (ORF1b) in a monophyletic subset of nidoviruses. However, apparent constraints on the size of ORF1b, which encodes this and other key replicative enzymes, have been hypothesized to limit further expansion of these viral RNA genomes. Here, we characterize a novel nidovirus (planarian secretory cell nidovirus; PSCNV) whose disproportionately large ORF1b-like region including unannotated domains, and overall 41.1-kb genome, substantially extend the presumed limits on RNA genome size. This genome encodes a predicted 13,556-aa polyprotein in an unconventional single ORF, yet retains canonical nidoviral genome organization and expression, as well as key replicative domains. These domains may include functionally relevant substitutions rarely or never before observed in highly conserved sites of RdRp, NiRAN, ExoN and 3CLpro. Our evolutionary analysis suggests that PSCNV diverged early from multi-ORF nidoviruses, and acquired additional genes, including those typical of large DNA viruses or hosts, e.g. Ankyrin and Fibronectin type II, which might modulate virus-host interactions. 
PSCNV's greatly expanded genome, proteomic complexity, and unique features, impressive in themselves, attest to the likelihood of still-larger RNA genomes awaiting discovery.
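
The size figures quoted in the abstract can be put side by side with a little arithmetic. The sketch below is illustrative only: it uses the rounded values from the abstract (41.1 kb, 33.5 kb, 13,556 aa), the function and variable names are hypothetical, and the result is not necessarily the percentage shown in the figures, whose exact formula is not given here.

```python
# Rounded values quoted in the abstract (not exact sequence lengths).
PSCNV_GENOME_KB = 41.1        # PSCNV genome size
PREVIOUS_LARGEST_KB = 33.5    # nidovirus previously considered near the upper limit
POLYPROTEIN_AA = 13_556       # predicted single-ORF polyprotein length


def relative_increase_pct(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old` (hypothetical helper)."""
    return (new - old) / old * 100.0


# How much larger the PSCNV genome is than the 33.5-kb genome.
increase_pct = relative_increase_pct(PSCNV_GENOME_KB, PREVIOUS_LARGEST_KB)

# Nucleotides needed to encode the polyprotein (3 nt per codon, stop codon excluded).
coding_nt = POLYPROTEIN_AA * 3

# Approximate share of the genome occupied by the single ORF,
# assuming the rounded 41.1-kb genome length.
coding_fraction = coding_nt / (PSCNV_GENOME_KB * 1000)

print(f"Genome size increase: {increase_pct:.1f}%")
print(f"Polyprotein coding length: {coding_nt} nt")
print(f"Approximate coding share of genome: {coding_fraction:.1%}")
```

Under these rounded inputs the genome is about 23% larger than the 33.5-kb genome, and the single ORF accounts for nearly the whole genome, which is consistent with the abstract's description of an unconventional single-ORF organization.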

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Genome sizes of nidoviruses.
(A) Timeline of discovery of largest RNA and DNA virus genomes versus accumulation of virus genome sequences in GenBank (1982–2017). PV, poliovirus; and nidoviruses: IBV, avian infectious bronchitis virus, MHV, mouse hepatitis virus, BWCoV, beluga whale coronavirus SW1, BPNV, ball python nidovirus and PSCNV, planarian secretory cell nidovirus. (B) Comparison of genome sizes between nidoviruses that do not encode an ExoN domain, and those that do. Percentage indicates the difference between sizes of PSCNV and the next-largest entity.
Fig 2. Genomes and proteomes of nidoviruses.
ORFs and encoded protein domains in genomes of viruses representing three nidovirus families and PSCNV. The protein-encoding part of the genomes is split into three adjacent regions, which are colored and labelled accordingly. EAV, equine arteritis virus; NDiV, Nam Dinh virus; SARS-CoV (see S1 Table for details on these viruses). ORF1a frame is set as zero. Protein domains conserved between these nidoviruses and PSCNV, and those specific to PSCNV are shown. TM, transmembrane domain (TM helices are shown by black bars above TM domains); Tandem repeats, two adjacent homologous regions of unknown function; RNase T2, ribonuclease T2 homolog; 3CLpro, 3C-like protease; NiRAN, nidovirus RdRp-associated nucleotidyltransferase; RdRp, RNA-dependent RNA polymerase; HEL1, superfamily 1 helicase with upstream Zn-binding domain (ZBD); ExoN, DEDDh subfamily exoribonuclease; N-MT and O-MT, SAM-dependent N7- and 2’-O-methyltransferases, respectively; Thr-rich, region enriched with Thr residues; FN2a/b, fibronectin type 2 domains; ANK, ankyrin domain.
Fig 3. Expression of PSCNV RNA in planarians.
(A) PSCNV RNA (blue) detected in asexual (left) and sexual S. mediterranea by whole-mount ISH. (B) Fluorescent ISH showing PSCNV expression in a sexual planarian. Insets show higher magnification of areas indicated by boxes. Top two insets are confocal projections. Secretory cell projections to lateral body edges are indicated by arrowheads. (C) Tiled confocal projections of PSCNV expression in a cross-section. Cells expressing PSCNV are ventrally located (arrowheads). Gut (“g”) and pharynx (“ph”) are indicated. DAPI (blue) labels nuclei.
Fig 4. Putative PSCNV particles revealed by electron microscopy.
(A) Adjacent histological transverse section, to orient EM images. Black rectangle corresponds to location of (B), a low-magnification EM view to provide context. White rectangle corresponds to location of (C), in which putative viral particles enclosed within membrane sacs are indicated by arrowheads. The white rectangle in (C) and square in (B) indicate positions of higher-magnification views shown in (D) and (E), respectively, each illustrating several viral particles within a membrane sac. In top-left of (C), note the mucus granules adjacent to virus-laden sacs (see also S2 Fig). Scale bars as indicated.
Fig 5. Largest proteins of nidoviruses and other RNA viruses in comparison with PSCNV polyprotein.
Percentage indicates the difference between sizes of the PSCNV polyprotein (pp) and that of the next-largest entity. For details, see S1 Materials and Methods.
Fig 6. ANK domain of PSCNV and its homologs.
The closest cellular homologs of PSCNV ANK are ranked by similarity (left, above the broken baseline) and depicted through phylogeny (right; reconstructed and rooted by BEAST, summarized as maximum clade credibility tree; PP, posterior probability of clades) along with protein domain architecture: S. med, Schmidtea mediterranea; D. lac, Dendrocoelum lacteum; RHD, Rel homology DNA-binding domain.
Fig 7. Phylogeny of PSCNV.
RdRp-based Bayesian maximum clade credibility tree and the genomic ORF organization (character state) for PSCNV, a representative set of nidoviruses, and astroviruses (outgroup). PP, posterior probability of clades. For virus names, see S1 Table.
Fig 8. Nidovirus genome and region size differences.
(A) Sizes of three nidovirus ORF regions. Percentage indicates the difference between a genome region’s size in PSCNV, and that of the next-largest entity. Color scheme as in Fig 2. (B) Size increase of the three genome regions in PSCNV (grey bars) relative to the increase expected if all regions had expanded evenly (broken line); calculated using formula D3, see text and S6 Table.
Fig 9. Genome translation.
Comparison of mechanisms by which ORFs 1a and 1b are translated in previously described nidoviruses (left) and PSCNV (right, hypothetical). On the top, RNA structure of the PRF sites, predicted by KnotInFrame, is presented: slippery sequence, pink; pseudoknot, blue.
Fig 10. Genome transcription.
(A) Mean depth of RNA-seq coverage along the PSCNV genome (approximated by exponential regression in ORF1b-like and 3’ORFs-like regions) calculated based on five datasets used to assemble the transcriptomes in which PSCNV was found [67]. Indicated on the genome map (colored as in Fig 2) are the positions of oligonucleotide repeats (leader and body TRSs) in the genome, and below is their alignment with a sg mRNA 5’-terminus identified by 5’-RACE (nucleotide mismatches between sg mRNA and TRSs are shown with grey backgrounds). (B) Predicted secondary structure of TRSs. TRSs are highlighted in green; the region upstream of the bTRS that interacts with its 5’-terminus is highlighted in yellow; asterisks indicate mismatching nucleotides of TRSs. (C) Model of discontinuous RNA synthesis mediated by TRSs and their secondary structure. The genome is represented by a solid line, and the nascent minus strand by a dashed line. Color code matches that of panel B.

References

    1. Joyce GF. The antiquity of RNA-based evolution. Nature. 2002;418(6894):214–21. doi: 10.1038/418214a.
    2. Leipe DD, Aravind L, Koonin EV. Did DNA replication evolve twice independently? Nucleic Acids Res. 1999;27(17):3389–401. PMCID: PMC148579.
    3. Poole AM, Logan DT. Modern mRNA proofreading and repair: clues that the last universal common ancestor possessed an RNA genome? Mol Biol Evol. 2005;22(6):1444–55. doi: 10.1093/molbev/msi132.
    4. Xavier JC, Patil KR, Rocha I. Systems biology perspectives on minimal and simpler cells. Microbiol Mol Biol Rev. 2014;78(3):487–509. doi: 10.1128/MMBR.00050-13. PMCID: PMC4187685.
    5. Li S, Guo W, Dewey CN, Greaser ML. Rbm20 regulates titin alternative splicing as a splicing repressor. Nucleic Acids Res. 2013;41(4):2659–72. doi: 10.1093/nar/gks1362. PMCID: PMC3575840.
