Genome Res. 2015 Dec;25(12):1910-20.
doi: 10.1101/gr.191049.115. Epub 2015 Sep 22.

Enhanced virome sequencing using targeted sequence capture

Todd N Wylie et al. Genome Res. 2015 Dec.

Abstract

Metagenomic shotgun sequencing (MSS) is an important tool for characterizing viral populations. It is culture independent, requires no a priori knowledge of the viruses in the sample, and may provide useful genomic information. However, MSS can lack sensitivity and may yield insufficient data for detailed analysis. We have created a targeted sequence capture panel, ViroCap, designed to enrich nucleic acid from DNA and RNA viruses from 34 families that infect vertebrate hosts. A computational approach condensed ∼1 billion bp of viral reference sequence into <200 million bp of unique, representative sequence suitable for targeted sequence capture. We compared the effectiveness of detecting viruses in standard MSS versus MSS following targeted sequence capture. First, we analyzed two sets of samples, one derived from samples submitted to a diagnostic virology laboratory and one derived from samples collected in a study of fever in children. We detected 14 and 18 viruses in the two sets, comprising 19 genera from 10 families, with dramatic enhancement of genome representation following capture enrichment. The median fold-increases in percentage viral reads post-capture were 674 and 296. Median breadth of coverage increased from 2.1% to 83.2% post-capture in the first set and from 2.0% to 75.6% in the second set. Next, we analyzed samples containing a set of diverse anellovirus sequences and demonstrated that ViroCap could be used to detect viral sequences with up to 58% variation from the references used to select capture probes. ViroCap substantially enhances MSS for a comprehensive set of viruses and has utility for research and clinical applications.
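The fold-increase metric reported in the abstract can be illustrated with a minimal sketch. It assumes that "percentage viral reads" means viral reads divided by total reads, and that the fold-increase is the post-capture percentage divided by the precapture percentage; the abstract does not spell out either definition. The function names and the read counts used to exercise them are hypothetical and are not data from the study.

```python
from statistics import median
from typing import Iterable, Tuple


def percent_viral(viral_reads: int, total_reads: int) -> float:
    """Percentage of reads in a library assigned to a virus (assumed definition)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * viral_reads / total_reads


def fold_increase(pre_pct: float, post_pct: float) -> float:
    """Ratio of post-capture to precapture percentage viral reads (assumed definition)."""
    if pre_pct <= 0:
        raise ValueError("precapture percentage must be positive to form a ratio")
    return post_pct / pre_pct


def median_fold_increase(pairs: Iterable[Tuple[float, float]]) -> float:
    """Median fold-increase over (pre_pct, post_pct) pairs, one pair per detected virus."""
    return median(fold_increase(pre, post) for pre, post in pairs)


# Hypothetical counts for illustration only.
pre = percent_viral(viral_reads=10, total_reads=1_000_000)
post = percent_viral(viral_reads=5_000, total_reads=1_000_000)
print(fold_increase(pre, post))
```

A virus with zero precapture reads has no defined ratio under this reading, which is why the sketch raises an error rather than returning infinity.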

Figures

Figure 1.
Taxonomic distribution of target genomes included in ViroCap. Shown are the viral groups and families included in the ViroCap targeted sequence capture panel. A highlighted subset illustrates underlying genera. To view complete genera for all families, see Supplemental Figure S1A. Taxonomic assignments were obtained from the NCBI Taxonomy Viewer (http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?opt=virus&taxid=10239).
Figure 2.
Targeted sequence capture enrichment. Examples are given showing the impact of targeted sequence capture on breadth and depth of genome coverage for eight representative viral genomes (A–H). For illustrative purposes, all of the coverage panels in this figure have been normalized by removing (deduplicating) reads based on identical alignment start-sites. Nucleotide positions along the reference genome are shown on the x-axis. The depth of deduplicated reads is shown on the y-axis. The shaded portion indicates the sequence coverage (breadth and depth) for each virus. Post-capture sequence coverage is represented in the larger panels in blue; precapture sequence coverage is shown in the insets in red. Note that y-axis ranges are different for each panel. At the top of each panel is shown the breadth of coverage (BoC) for the sample. The header of each panel includes breadth of coverage gain (BoC gain), sample ID, and reference genome name and NCBI version number. BoC gain is calculated by subtracting the percentage of the length of the reference genome that was covered by sequence reads in precapture MSS from the percentage of the length of the reference genome covered by post-capture sequence reads.
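The deduplication and BoC gain calculation described in this legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each aligned read is represented as a hypothetical `(start, length)` tuple with a 0-based start on the reference, ignores strand, and keeps the first read seen at each start-site. All function names are assumptions introduced here.

```python
from typing import List, Sequence, Tuple

Read = Tuple[int, int]  # (0-based alignment start, aligned length); hypothetical representation


def deduplicate_by_start(reads: Sequence[Read]) -> List[Read]:
    """Keep one read per alignment start-site (the first one encountered)."""
    seen = set()
    kept = []
    for start, length in reads:
        if start not in seen:
            seen.add(start)
            kept.append((start, length))
    return kept


def depth_per_position(reads: Sequence[Read], genome_length: int) -> List[int]:
    """Count how many reads cover each reference position."""
    depth = [0] * genome_length
    for start, length in reads:
        # Clip the read to the reference boundaries before counting.
        for pos in range(max(start, 0), min(start + length, genome_length)):
            depth[pos] += 1
    return depth


def breadth_of_coverage(depth: Sequence[int]) -> float:
    """Percentage of reference positions covered by at least one read."""
    if not depth:
        return 0.0
    covered = sum(1 for d in depth if d > 0)
    return 100.0 * covered / len(depth)


def boc_gain(pre_depth: Sequence[int], post_depth: Sequence[int]) -> float:
    """Post-capture BoC minus precapture BoC, in percentage points."""
    return breadth_of_coverage(post_depth) - breadth_of_coverage(pre_depth)
```

Because BoC gain is a difference of two percentages, it is expressed in percentage points and is bounded by 100. It is therefore a different quantity from the fold-increase in percentage viral reads reported in the abstract.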
Figure 3.
Targeted sequence capture identifies divergent sequences. (A) The percentage identity of the top high-scoring segment pair (HSP) identified from the BLAST alignment of anellovirus contig sequences to the references used to design ViroCap is plotted on the y-axis. The x-axis represents the percentage of the length of the anellovirus contig covered after targeted sequence capture. (B) This coverage plot represents the sequence coverage of a divergent anellovirus contig sequence. The figure is designed as described in the figure legend for Figure 2, with the following addition: The post-capture coverage plot is shaded to show regions of nucleotide sequence variation between the anellovirus contig and the most similar reference genome in the ViroCap panel. Dark shading represents areas of identical sequence, and each position with nucleotide mismatch between aligned sequences is shown in the lighter color. All of the HSPs are shown, rather than just the top HSP.
