J Virol. 2012 May;86(10):5562-73.
doi: 10.1128/JVI.07196-11. Epub 2012 Feb 29.

Identification of novel positive-strand RNA viruses by metagenomic analysis of archaea-dominated Yellowstone hot springs

Benjamin Bolduc et al. J Virol. 2012 May.

Abstract

There are no known RNA viruses that infect Archaea. Filling this gap in our knowledge of viruses will enhance our understanding of the relationships between RNA viruses from the three domains of cellular life and, in particular, could shed light on the origin of the enormous diversity of RNA viruses infecting eukaryotes. We describe here the identification of novel RNA viral genome segments from high-temperature acidic hot springs in Yellowstone National Park in the United States. These hot springs harbor low-complexity cellular communities dominated by several species of hyperthermophilic Archaea. A viral metagenomics approach was taken to assemble segments of these RNA virus genomes from viral populations isolated directly from hot spring samples. Analysis of these RNA metagenomes demonstrated unique gene content that is not generally related to known RNA viruses of Bacteria and Eukarya. However, genes for RNA-dependent RNA polymerase (RdRp), a hallmark of positive-strand RNA viruses, were identified in two contigs. One of these contigs is approximately 5,600 nucleotides in length and encodes a polyprotein that also contains a region homologous to the capsid protein of nodaviruses, tetraviruses, and birnaviruses. Phylogenetic analyses of the RdRps encoded in these contigs indicate that the putative archaeal viruses form a unique group that is distinct from the RdRps of RNA viruses of Eukarya and Bacteria. Collectively, our findings suggest the existence of novel positive-strand RNA viruses that probably replicate in hyperthermophilic archaeal hosts and are highly divergent from RNA viruses that infect eukaryotes and even more distant from known bacterial RNA viruses. These positive-strand RNA viruses might be direct ancestors of RNA viruses of eukaryotes.

Figures

Fig 1
Selected examples of the results of screening hot springs for the presence of RNA templates in the RNA virus-enriched fractions. [32P]dCTP incorporation into RT-dependent first-strand cDNA synthesis is shown. For each hot spring sample, results are shown for prior RNase treatment of the sample (+RNase), which was performed as a control. Resampling of selected hot springs 12 months later demonstrated the maintenance of RNA signal in the viral fraction. The positive control was a known positive-strand ssRNA virus, cowpea chlorotic mottle virus (CCMV), and the negative control was water (H2O). Error bars show standard deviations.
Fig 2
Hierarchical classification of contigs with MG-RAST. Classification of cellular (A), viral DNA (B), and viral RNA (C) sequences based on the M5NR database in MG-RAST. Sequences with insufficient significance or those that did not match were classified as “Not Assigned/Unknown.”
Fig 3
Detection and single-stranded nature of RNA viral sequences within hot springs months after sampling for metagenomic analysis. Total nucleic acids of samples obtained 18 months after the original sampling were extracted from either the viral fraction or total cellular fraction of NL10 (B) and NL18 (A), DNase treated, and subjected to RT-PCR for metagenomic analysis. The detection and strand-specific nature of contig00009 (A) and contig00002 (B) were determined using contig-specific primer sets (see Table S2 in the supplemental material). Either forward (+), reverse (−), or both (+/−) primers were added to the initial first-strand cDNA synthesis mix and the mixture incubated as described in Materials and Methods. After first-strand synthesis, the complete primer pair was added (if necessary) for the PCR stage. The control, where reverse transcriptase was excluded from the procedure (−RT), and the molecular-size marker (bp) are indicated.
Fig 4
The RdRp of the putative archaeal viruses. (A) Multiple sequence alignment of the putative archaeal virus RdRps with homologs from positive-strand RNA viruses of eukaryotes and bacteria and reverse transcriptases from bacterial group 2 introns. The (nearly) universally conserved amino acid residues in the three signature motifs of the RdRps are highlighted in cyan, and partially conserved residues are highlighted in yellow. The top two sequences are from the putative archaeal viruses identified in this work. The rest of the sequences include representatives of the 29 clusters of viral RdRps identified as described in Materials and Methods. The two sequences at the bottom (RTG2) are reverse transcriptases. Each sequence is denoted by the GenBank identifier (GI number) and the name of the virus group (typically family) and subgroup (typically genus). The numbers denote the lengths (number of amino acids) in less-well-conserved regions between the conserved motifs and the distances between the ends of the respective proteins and the aligned segments. (B) Structural model of the central core region of the putative archaeal virus RdRp from contig00002 (blue) using the calicivirus RdRp as a template (red; PDB ID 3bso) (74).
Fig 5
The predicted capsid protein of the putative archaeal virus and its potential autoproteolytic activity. Multiple alignment of the putative archaeal virus capsid protein with the capsid protein sequences of nodaviruses, tetraviruses, and birnaviruses. Sequences are identified either by PDB ID or by GenBank GI numbers. Secondary structure is shown for the three available crystal structures denoted by the corresponding PDB codes (l, loop; H, helix; E, beta-strand). Alignment columns with homogeneity of 0.4 or greater (see Materials and Methods) are highlighted in yellow. The catalytic Asp is highlighted in cyan, and the cleavage site Asn is highlighted in green.
Fig 6
Phylogenetic tree of the RdRps. The unrooted phylogenetic tree was generated as described in Materials and Methods. Bootstrap support values greater than 0.5 are indicated for deep internal branches. Large groups of positive-strand RNA viruses from eukaryotes and bacteria (Levi group) and the bacterial retrotranscribing elements (RT) are indicated and shown by unique colors.
Fig 7
Analysis of cellular CRISPR direct repeat units and numbers of unique spacers matching archaeal species and RNA viral metagenome contigs. (A) Schematic representation of the CRISPR loci indicating the direct repeat structure interspaced with spacer regions. (B) The number of CRISPR spacer sequences with 100% match to viral RNA contigs and the alignment of the direct repeat structures (top sequence) with the closest sequenced organism (bottom sequence). Percentages of identity and E values are indicated.
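The spacer counting summarized in Fig 7B can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy repeat, and the toy sequences are hypothetical, and it assumes that a "100% match" means the full spacer occurs exactly in a contig on either strand.

```python
# Hypothetical sketch: count CRISPR spacers that match viral contigs exactly.
# All names and sequences are illustrative and do not come from the paper.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA (cDNA) sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def extract_spacers(array: str, repeat: str) -> list[str]:
    """Split a CRISPR array on its direct repeat and return the spacers.

    The first and last pieces are flanking sequence, not spacers, so they
    are dropped. Assumes the repeat is perfectly conserved in the array.
    """
    pieces = array.upper().split(repeat.upper())
    return [p for p in pieces[1:-1] if p]


def count_matching_spacers(spacers: list[str], contigs: dict[str, str]) -> dict[str, int]:
    """Count unique spacers with an exact full-length match in each contig.

    Both orientations are checked, because a spacer may be stored in
    either strand orientation relative to the assembled contig.
    """
    hits = {name: 0 for name in contigs}
    for spacer in set(spacers):
        rc = reverse_complement(spacer)
        for name, seq in contigs.items():
            seq = seq.upper()
            if spacer in seq or rc in seq:
                hits[name] += 1
    return hits


if __name__ == "__main__":
    repeat = "GGGCCC"  # toy direct repeat
    array = "AA" + repeat + "ATGCATTA" + repeat + "TTGACCAA" + repeat + "TT"
    contigs = {
        "contig_a": "CCCC" + "ATGCATTA" + "GGGG",
        "contig_b": "AAAA" + reverse_complement("TTGACCAA") + "CCCC",
    }
    print(count_matching_spacers(extract_spacers(array, repeat), contigs))
```

Exact substring search is enough for perfect matches; tolerating mismatches would call for an alignment tool such as BLAST instead.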

References

    1. Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21:2104–2105
    2. Ackermann HW. 2007. 5500 phages examined in the electron microscope. Arch. Virol. 152:227–243
    3. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic Local Alignment Search Tool. J. Mol. Biol. 215:403–410
    4. Altschul SF, et al. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402
    5. Angly FE, et al. 2006. The marine viromes of four oceanic regions. PLoS Biol. 4:2121–2131
