PLoS One. 2013 Jun 10;8(6):e66129.
doi: 10.1371/journal.pone.0066129. Print 2013.

A modified RNA-Seq approach for whole genome sequencing of RNA viruses from faecal and blood samples

Elizabeth M Batty et al. PLoS One.

Abstract

To date, very large scale sequencing of many clinically important RNA viruses has been complicated by their high population molecular variation, which creates challenges for polymerase chain reaction and sequencing primer design. Many RNA viruses are also difficult or currently not possible to culture, severely limiting the amount and purity of available starting material. Here, we describe a simple, novel, high-throughput approach to Norovirus and Hepatitis C virus whole genome sequence determination based on RNA shotgun sequencing (also known as RNA-Seq). We demonstrate the effectiveness of this method by sequencing three Norovirus samples from faeces and two Hepatitis C virus samples from blood, on an Illumina MiSeq benchtop sequencer. More than 97% of reference genomes were recovered. Compared with Sanger sequencing, our method had no nucleotide differences in 14,019 nucleotides (nt) for Noroviruses (from a total of 2 Norovirus genomes obtained with Sanger sequencing), and 8 variants in 9,542 nt for Hepatitis C virus (1 variant per 1,193 nt). The three Norovirus samples had 2, 3, and 2 distinct positions called as heterozygous, while the two Hepatitis C virus samples had 117 and 131 positions called as heterozygous. To confirm that our sample and library preparation could be scaled to true high-throughput, we prepared and sequenced an additional 77 Norovirus samples in a single batch on an Illumina HiSeq 2000 sequencer, recovering >90% of the reference genome in all but one sample. No discrepancies were observed across 118,757 nt compared between Sanger and our custom RNA-Seq method in 16 samples. By generating viral genomic sequences that are not biased by primer-specific amplification or enrichment, this method offers the prospect of large-scale, affordable studies of RNA viruses which could be adapted to routine diagnostic laboratory workflows in the near future, with the potential to directly characterize within-host viral diversity.
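The Sanger comparison figures in the abstract reduce to simple ratios. As a minimal sketch (the function name `nt_per_variant` is illustrative and not taken from the paper), the quoted rate of 1 variant per 1,193 nt follows from 8 variants in 9,542 compared nucleotides:

```python
def nt_per_variant(variants, compared_nt):
    """Average number of compared nucleotides per discrepant call.

    Returns None when no variants were found, because the rate is then
    unbounded rather than a finite number.
    """
    if compared_nt <= 0:
        raise ValueError("compared_nt must be positive")
    if variants == 0:
        return None
    return compared_nt / variants


hcv = nt_per_variant(8, 9542)     # Hepatitis C virus: RNA-Seq vs Sanger
noro = nt_per_variant(0, 14019)   # Norovirus: no differences observed

print(round(hcv))  # 1193
print(noro)        # None
```

The zero-variant case is returned as `None` rather than a number, since "no differences in 14,019 nt" gives only a lower bound on the spacing between discrepancies.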

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Coverage profiles of one Norovirus sample from amplicon and direct RNA sequencing.
A – Coverage across the genome for one Norovirus sample sequenced from PCR amplicons (others similar). Green and orange dotted lines mark the locations of the PCR primers used to generate the amplicons. B – Coverage across the genome for the same Norovirus sample sequenced directly from RNA.
Figure 2. Coverage across the genome for two Hepatitis C samples sequenced directly from RNA.
Figure 3. Evolutionary tree created by BEAST (Bayesian evolutionary analysis sampling trees) depicting the relatedness of all the full genomic sequences (61 sequences, excluding repeated pairs).
Clusters of genomes are visible among viruses sampled at similar points in time. Whole genome sequencing gives adequate resolution to distinguish potentially divergent viral strains circulating at the same time, as illustrated in clusters from January 2010, February 2011 and March 2011. WO = ward outbreak. Each node and branch is coloured by the posterior probability supporting that clade, calculated by Bayesian analysis (Dark Blue = 1 (high); Light Red = 0 (low)). Analysis was performed using BEAST v.1.7.5, combining two chains run with different random number seeds (10 million iterations each, saving 1 in 1000 iterations, with a 1 million iteration burn-in) using: HKY substitution; estimated frequency; strict clock; and constant population size coalescent tree prior. The maximum clade credibility tree was computed using TreeAnnotator v.1.7.5 and plotted with FigTree v.1.4.0.
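The chain settings in the Figure 3 legend imply a specific number of posterior samples behind the tree. The sketch below works through that arithmetic. It assumes the burn-in is discarded from each chain before the two are combined, and it ignores whether the state at the burn-in boundary itself is retained, which can shift the count by one per chain. The function name `retained_samples` is illustrative and is not part of BEAST.

```python
def retained_samples(chain_length, sample_every, burn_in, n_chains=1):
    """Number of logged states left after discarding burn-in.

    Assumes every chain uses the same length, sampling interval and
    burn-in, and that burn-in is removed per chain before combining.
    """
    if sample_every <= 0:
        raise ValueError("sample_every must be positive")
    if not 0 <= burn_in < chain_length:
        raise ValueError("burn_in must be in [0, chain_length)")
    per_chain = (chain_length - burn_in) // sample_every
    return per_chain * n_chains


# Settings quoted in the legend: 10 million iterations per chain,
# 1 in 1000 saved, 1 million iteration burn-in, two chains combined.
total = retained_samples(10_000_000, 1_000, 1_000_000, n_chains=2)
print(total)  # 18000
```

Under these assumptions, roughly 9,000 sampled trees per chain (about 18,000 in total) would feed the maximum clade credibility summary.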
Figure 4. Comparison of the different fragmentation methods.
A) fragment size distribution of a library prepared using the standard fragmentation method (red) and a library prepared using the new fragmentation method (blue). B) the coverage across the genome for the standard fragmentation sample (red) and the new fragmentation sample (blue). Data has been scaled as the difference from the median coverage for both samples.
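The scaling described for panel B can be sketched as follows. This is one reading of the legend, in which each sample's per-position coverage is centred on that sample's own median. The legend does not state whether the median is taken per sample or pooled, so that choice is an assumption. The depth values are made up for illustration, and `center_on_median` is a hypothetical helper name.

```python
from statistics import median


def center_on_median(depths):
    """Express per-position coverage as its difference from the median depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    m = median(depths)
    return [d - m for d in depths]


# Hypothetical per-position read depths for one library (not real data).
standard = [12, 40, 35, 90, 38]

print(center_on_median(standard))  # [-26, 2, -3, 52, 0]
```

Centring on the median puts libraries sequenced to different depths on a common baseline, so the curves show evenness of coverage rather than total yield.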
Figure 5. Schematic representation of different strategies for viral genome resequencing.
A) Total RNA library: all the RNA species present in the sample are sequenced, with no assumption about which genome is present; B) Hybridisation capture of an mRNA library: a good reference genome is needed to design the probes for capture; C) PCR enrichment: the desired genome is amplified from cDNA, and a reference genome is needed to design specific oligos. Red lines, genomes of interest; Blue segments, Illumina adapters; Black lines, other RNA species.
