Nucleic Acids Res. 2023 Jan 6;51(D1):D733-D743.
doi: 10.1093/nar/gkac1037.

IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata

Antonio Pedro Camargo et al. Nucleic Acids Res.

Abstract

Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad ways in which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. These clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences in IMG/VR are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and IMG standard annotations are complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.

Figures

Figure 1.
IMG/VR v4 composition regarding UViG origin, scaffold topology, taxonomic assignment and availability of host data. Each pie chart displays the number of UViGs from different types of datasets (‘Origin’), with different sequence topologies (‘Topology’), classified into an existing virus taxon at any rank (‘Taxonomy’), and with at least one host prediction available (‘Assigned host’).
Figure 2.
(A) Fraction of high-confidence UViGs within IMG/VR v4 and the contribution of each of the completeness tiers. (B) Left: length distribution of UViGs classified as putative viruses and high-confidence viruses. Right: length distribution of the different completeness tiers within high-confidence predictions. (C) Accumulation curves of vOTUs across different subsets of IMG/VR. The top left panel shows the accumulation of vOTUs as a function of the number of UViGs for all sequences in IMG/VR v4, only high-confidence sequences, or for the sequences that were already present in IMG/VR v3. The top right panel shows similar accumulation curves considering only high-quality (i.e. high-confidence and ≥90% complete) UViGs. In this panel, for IMG/VR v3 high-quality UViGs, only sequences that were available through the IMG/VR v3 interface and are still available in the IMG/VR v4 database were included. Finally, the bottom panel shows similar accumulation curves separated by ecosystem and considering all sequences in IMG/VR v4. Each curve is the average of 50 random permutations, with the minimum and maximum value at each step indicated with a gray (top two panels) or colored (bottom panel) outline.
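The permutation procedure behind the accumulation curves in panel C can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `accumulation_curve`, its parameters, and the toy labels in the usage note are hypothetical, and it assumes only that each UViG carries a single vOTU label. For each random ordering of the UViGs it counts the distinct vOTUs seen so far, then reports the mean, minimum, and maximum of that count at every step across the permutations.

```python
import random


def accumulation_curve(votu_labels, n_permutations=50, seed=0):
    """Return (means, lows, highs) of distinct-vOTU counts per step.

    votu_labels: one vOTU label per UViG (hypothetical input format).
    n_permutations: number of random orderings to average over.
    """
    rng = random.Random(seed)
    n = len(votu_labels)
    totals = [0] * n   # running sum of counts, used for the mean
    lows = [n] * n     # n is an upper bound on any distinct count
    highs = [0] * n
    order = list(votu_labels)
    for _ in range(n_permutations):
        rng.shuffle(order)
        seen = set()
        for i, label in enumerate(order):
            seen.add(label)
            count = len(seen)  # distinct vOTUs after i + 1 UViGs
            totals[i] += count
            lows[i] = min(lows[i], count)
            highs[i] = max(highs[i], count)
    means = [t / n_permutations for t in totals]
    return means, lows, highs
```

For example, `accumulation_curve(["a", "a", "b", "c", "c", "c"])` returns three lists of length six; the last entry of each equals the total number of distinct labels, since every permutation has seen all UViGs by the final step.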
Figure 3.
(A) Left: number of UViGs assigned to each taxonomic rank. For each UViG, only the most specific rank was considered. Right: number of UViGs assigned to major taxa. Percentages over the bars represent the fraction of UViGs within each group (all UViGs, high-confidence viruses, high-quality viruses) that is represented by that bar. (B) Frequency of bacterial (top tree) and archaeal (bottom tree) hosts to which UViGs were assigned. Counts for all UViGs (gray), high-confidence UViGs (orange), and high-quality UViGs (blue) are shown in separate columns. Relative frequencies for each of the host assignment methods are shown in the fourth column. Trees were retrieved from GTDB (release 207).
Figure 4.
(A) Geographical distribution of the IMG/VR v4 viruses at the vOTU level based on the coordinates of IMG/M metagenomes and metatranscriptomes. Rings represent the total number of vOTUs within an area and filled circles represent the number of vOTUs with at least one high-confidence prediction. Data points are colored according to the number of samples where UViGs were detected. Samples were binned in fixed intervals across the longitude and latitude. UViGs identified in microbial genomes or imported from RefSeq are not represented. (B) Environmental distribution of major virus taxa. Bars represent the fraction of UViGs that were found within metagenomes and metatranscriptomes assigned to each of the major ecosystem classes.
Figure 5.
(A) IMG/VR’s web interface allows users to browse UViGs according to multiple features related to the viral genome and the sample where it was identified. (B) Diverse metadata queries can be combined to search for UViGs. (C) The ‘Find similar UViGs’ tool allows users to find viruses with similar gene composition to a given query by identifying UViGs with similar sets of geNomad markers.
