Nucleic Acids Res. 2022 Aug 12;50(14):e83.
doi: 10.1093/nar/gkac341.

vRhyme enables binning of viral genomes from metagenomes

Kristopher Kieft et al. Nucleic Acids Res.

Abstract

Genome binning has been essential for characterization of bacteria, archaea, and even eukaryotes from metagenomes. Yet, few approaches exist for viruses. We developed vRhyme, a fast and precise software tool for constructing viral metagenome-assembled genomes (vMAGs). vRhyme compares single- or multi-sample coverage effect sizes between scaffolds and uses supervised machine learning to identify nucleotide feature similarities; these are compiled into iterations of weighted networks and refined bins. To refine bins, vRhyme exploits a feature unique to viral genomes: a protein redundancy scoring mechanism based on the observation that viruses seldom encode redundant genes. Using simulated viromes, we demonstrated that vRhyme outperforms available binning tools in constructing more complete and uncontaminated vMAGs. When applied to 10,601 viral scaffolds from human skin, vRhyme advanced our understanding of resident viruses, highlighted by the identification of a Herelleviridae vMAG composed of 22 scaffolds and another vMAG encoding a nitrate reductase metabolic gene, representing near-complete genomes post-binning. vRhyme will enable a convention of binning uncultivated viral genomes and has the potential to transform metagenome-based viral ecology.
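
The coverage comparison described above can be illustrated with a minimal sketch. It assumes Cohen's d with a pooled standard deviation as the effect-size measure, which the abstract does not specify, and the function names, the per-sample maximum, and the toy coverage values are all hypothetical. It is not vRhyme's implementation.

```python
import math
from statistics import mean, variance

def cohens_d(cov_a, cov_b):
    """Absolute Cohen's d between two coverage samples, using the pooled SD."""
    n_a, n_b = len(cov_a), len(cov_b)
    pooled_var = ((n_a - 1) * variance(cov_a) + (n_b - 1) * variance(cov_b)) / (n_a + n_b - 2)
    diff = abs(mean(cov_a) - mean(cov_b))
    if pooled_var == 0:
        # No spread at all: identical means are a perfect match, otherwise no match.
        return 0.0 if diff == 0 else math.inf
    return diff / math.sqrt(pooled_var)

def max_effect_size(profile_a, profile_b):
    """Largest per-sample effect size (assumed rule: scaffolds must agree in every sample)."""
    return max(cohens_d(a, b) for a, b in zip(profile_a, profile_b))

# Hypothetical per-base coverage for three scaffolds across two samples.
scaffold_1 = [[10, 12, 11, 9, 13], [30, 32, 31, 29, 33]]
scaffold_2 = [[11, 12, 10, 10, 12], [31, 30, 32, 31, 31]]
scaffold_3 = [[50, 52, 49, 51, 48], [5, 6, 4, 5, 5]]

print(max_effect_size(scaffold_1, scaffold_2))  # small: similar coverage in both samples
print(max_effect_size(scaffold_1, scaffold_3))  # large: coverage differs in both samples
```

A small effect size suggests that two scaffolds could belong to the same genome. In this sketch, scaffold_1 and scaffold_2 would be candidates for the same bin, while scaffold_3 would not.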

Figures

Figure 1.
Flowchart of the vRhyme workflow and methodology. Scaffolds are compared pairwise by read coverage effect size differences using single or multiple samples (top-left), followed by sequence feature distance comparisons (top-right). Multiple iterations of network clustering of putative bins are generated, with edge weights representing normalized coverage effect size and supervised machine learning probabilities of sequence feature similarity (center). The bins are refined by KMeans clustering, and the best set of bins from a single iteration is identified after protein redundancy identification and scoring (bottom).
Figure 2.
Benchmarking performance metrics of vRhyme compared to MetaBat2, VAMB, CoCoNet, CONCOCT and BinSanity. Each boxplot represents the results of nine different datasets, except for VAMB, for which three datasets are shown. In total, 999 non-redundant genomes artificially split into 4,324 sequence fragments are shown. For some plots, a dotted line is shown at 1.0 to indicate optimal performance. CONCOCT and BinSanity are only partially shown on the Genome-to-Bin Ratio plot for better visualization; each yielded an average ratio >2.0.
Figure 3.
Impact of binning with vRhyme on the benchmarking datasets. For (A–C), the putatively complete unsplit input genomes, generated sequence fragments, binning sequence fragments, and vRhyme bins (vMAGs) are compared. (A) Estimation of genome completeness using CheckV. (B) Sequence or vMAG nucleotide length. For (A, B) each dot represents a single sequence or vMAG. (C) Estimation of taxonomy at the family level using a custom analysis script. ‘unassigned’ represents a taxonomic classification to a group with an unassigned family, ‘ambiguous’ represents equal assignment to multiple families (typically Caudoviricetes), and ‘unknown’ represents the inability to make a prediction. (D) Evaluation of vRhyme, MetaBat2, VAMB, and CoCoNet for the binning of complete genomes. The expectation is that complete genomes should remain unbinned as uncultivated virus genomes (UViGs).
Figure 4.
Benchmark binning and genome completeness evaluation of GOV2. Comparison of vRhyme, MetaBat2, and CoCoNet (A) raw results and (B) low contamination filtering results by the number of scaffolds binned and identified redundancy. For vRhyme only, CheckV was used to identify (C) the estimated completeness values, (D) number of ‘NA’ completeness values, (E) number of ‘no viral genes’ scaffolds/vMAGs and (F) number of ‘longer than expected’ scaffolds/vMAGs for the low contamination results of individual binned scaffolds as well as vMAGs.
Figure 5.
Binning improves and expands the analysis of viruses from human skin. (A) Comparison of the number of original viral scaffolds identified across all individuals before and after binning. (B) Heatmap of coverage for the seven common bins per individual. (C) Genome visualization and alignment of Herelleviridae reference phiSA_BS2 (outer) and Tw bin 8 (inner). Each arrow represents a predicted open reading frame and black bars are artificial connections between vMAG scaffolds. (D) Alignment of vRhyme Vf bin 113 to the closest reference virus Siphoviridae isolate ctiXA4 (BK057074.1). Each of the four scaffolds was independently aligned by tBLASTx similarity. The narG AMG is labeled in yellow and viral hallmark annotations are labeled in light blue. (E) Representative cluster from all input viral scaffolds generated by vConTACT2, with the four Vf bin 113 scaffolds labeled in green. There are no connections between any of the four green scaffolds. Each dot represents a single scaffold. (F) Partial network from all vRhyme binned and unbinned viral scaffolds generated by vConTACT2, with vMAG bins labeled in orange and Vf bin 113 in green. For (E, F), complete network diagrams can be found in Supplementary Figures S4 and S5.
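
The protein redundancy idea mentioned in the abstract and in Figure 1 can be sketched as follows. This is an illustration under stated assumptions, not vRhyme's scoring scheme: it assumes each scaffold's proteins have already been assigned to protein clusters, and the function name, cluster labels, and example bins are hypothetical.

```python
from collections import Counter

def protein_redundancy(bin_scaffolds):
    """Count protein clusters that occur on more than one scaffold in a bin."""
    counts = Counter()
    for clusters in bin_scaffolds.values():
        counts.update(set(clusters))  # count each cluster once per scaffold
    return sum(1 for n in counts.values() if n > 1)

# Hypothetical bins: scaffold name -> protein cluster labels found on it.
clean_bin = {
    "scaffold_a": ["terminase", "portal", "capsid"],
    "scaffold_b": ["tail_fiber", "lysin"],
}
suspect_bin = {
    "scaffold_c": ["terminase", "portal", "capsid"],
    "scaffold_d": ["capsid", "terminase", "tail_fiber"],
}

print(protein_redundancy(clean_bin))
print(protein_redundancy(suspect_bin))
```

Because viruses seldom encode redundant genes, a bin whose scaffolds share several protein clusters is more likely to mix fragments from different genomes, so a lower count is preferable when comparing candidate bin sets.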
