Viruses. 2020 Nov 1;12(11):1248.
doi: 10.3390/v12111248.

Virosaurus A Reference to Explore and Capture Virus Genetic Diversity

Anne Gleizes et al. Viruses.

Abstract

The huge genetic diversity of circulating viruses is a challenge for diagnostic assays for emerging or rare viral diseases. High-throughput technology offers a new opportunity to explore the global virome of patients without preconceptions about the causative pathogens. It requires a solid reference dataset to be accurate. Virosaurus has been designed to offer an unbiased, automated and annotated database for clinical metagenomics studies and diagnosis. Raw viral sequences have been extracted from GenBank and cleaned up to remove potentially erroneous sequences. Complete sequences have been identified for all genera infecting vertebrates, plants and other eukaryotes (insects, fungi, etc.). To facilitate the analysis of clinically relevant viruses, we have annotated all sequences with official and common virus names, acronyms, genotypes, and genomic features (linear, circular, DNA, RNA, etc.). Sequences have been clustered to remove redundancy at 90% or 98% identity. The analysis of clustering results reveals the current state of knowledge of the virus genetic landscape. Because herpesviruses and poxviruses were under-represented in complete genomes considering their potential diversity in nature, we used genes instead of complete genomes for those in Virosaurus.
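To illustrate what removing redundancy at a given identity threshold means, here is a minimal Python sketch of greedy clustering. It is not the Virosaurus pipeline, whose clustering tool and parameters are not given in this abstract. The function names (`identity`, `greedy_cluster`, `reduction_percent`), the toy sequences and the simplified identity measure (fraction of matching positions between pre-aligned, equal-length sequences) are all assumptions made for illustration.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned,
    equal-length sequences (a simplification; real tools align first)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def greedy_cluster(seqs: dict[str, str], threshold: float) -> dict[str, list[str]]:
    """Assign each sequence to the first representative it matches at or
    above the threshold; otherwise it becomes a new representative."""
    clusters: dict[str, list[str]] = {}
    for seq_id, seq in seqs.items():
        for rep in clusters:
            if identity(seq, seqs[rep]) >= threshold:
                clusters[rep].append(seq_id)
                break
        else:
            clusters[seq_id] = [seq_id]
    return clusters


def reduction_percent(n_input: int, n_clusters: int) -> float:
    """Percentage of sequences removed when only representatives are kept."""
    if n_input == 0:
        return 0.0
    return 100.0 * (1.0 - n_clusters / n_input)


# Hypothetical toy data: three aligned 20-nucleotide "genomes".
toy = {
    "isolate_A": "ACGTACGTACGTACGTACGT",
    "isolate_B": "ACGTACGTACGTACGTACGA",  # 1 mismatch vs isolate_A (95% identity)
    "isolate_C": "TTGTACGTACCTACGTACGA",  # 4 mismatches vs isolate_A (80% identity)
}

clusters_90 = greedy_cluster(toy, 0.90)
clusters_98 = greedy_cluster(toy, 0.98)
print(clusters_90)
print(len(clusters_98), "clusters at 98%")
print(round(reduction_percent(len(toy), len(clusters_90)), 1), "% reduction at 90%")
```

With these toy sequences, the 90% threshold merges isolate_A and isolate_B into one cluster, while the stricter 98% threshold keeps all three sequences as separate representatives.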

Keywords: HTS; bioinformatics; complete genome; database; diagnostics; sequencing; viral infections; viruses.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Workflow for the creation of the Virosaurus datasets. References are in blue, and output datasets are in green.
Figure 2
Examples of virus genome annotation. The usual name and clinical typing should be the default output for clinical studies and are shown in red.
Figure 3
Example of gathering reads for the same virus. On the left, 10 isolates represent the clusters for this virus. Twenty-eight reads show homology to those reference sequences; they can all be grouped under the “human polyomavirus 2” entity, which facilitates interpretation of the results.
Figure 4
Relative number of sequences for the 13 most sequenced human viruses: (A) total sequences from GenBank, (B) complete virus sequences, (C) Virosaurus 98 and (D) Virosaurus 90. (Data from release 2019_10).
Figure 5
Percentage of sequence reduction obtained by clustering complete genomes at 90% identity. (Data from Virosaurus release 2019_10).
Figure 6
Human blood samples were sequenced, and reads generated using the RNA protocol [9] were aligned to the Virosaurus database. The result is easy to interpret and confirms that the patient was positive for a novel human astrovirus, HIV-1 and HHV-8 sequences, as previously reported [23]. Top panel: 2D representation of detected sequences, with % segment coverage on the X-axis and median depth on the Y-axis; bottom panel: raw data. The size of the dots is proportional to the number of reads. Anellovirus (TTV) sequences were also detected. The Virosaurus hierarchy allows reads to be allocated to viral entities at the level of the virus (HIV-1, HHV-8, MastV-6) or higher (TTV). Unclassified SEN viruses are TTV-like genomes. Mamastrovirus (Novel) is a subtype label that distinguishes novel (i.e., MLB and VA/HMO) from classical human astroviruses.
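
The grouping of reads under a single viral entity described in the captions of Figures 3 and 6 can be sketched as a simple lookup and count. This is a hedged illustration only: the reference IDs, read IDs, annotation table and the function `count_reads_per_virus` are hypothetical and do not reflect the actual Virosaurus header format or any particular aligner's output.

```python
from collections import Counter

# Hypothetical annotation table: reference sequence ID -> usual virus name.
reference_to_virus = {
    "ref_001": "human polyomavirus 2",
    "ref_002": "human polyomavirus 2",
    "ref_003": "HIV-1",
}

# Hypothetical alignment result: (read ID, best-hit reference ID).
read_hits = [
    ("read_1", "ref_001"),
    ("read_2", "ref_002"),
    ("read_3", "ref_002"),
    ("read_4", "ref_003"),
    ("read_5", "ref_999"),  # hit on a reference missing from the table
]


def count_reads_per_virus(hits, annotation):
    """Count reads per virus name, pooling hits on different cluster
    representatives that carry the same annotation."""
    counts = Counter()
    for _read_id, ref_id in hits:
        counts[annotation.get(ref_id, "unannotated")] += 1
    return counts


per_virus = count_reads_per_virus(read_hits, reference_to_virus)
for virus, n_reads in per_virus.most_common():
    print(virus, n_reads)
```

Here, reads that hit two different representatives (`ref_001` and `ref_002`) are reported under one name, which is the kind of simplification the figure captions describe.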

References

    1. Human Viruses Table—ViralZone Page. [(accessed on 9 July 2020)]; Available online: https://swissprot.sib.swiss/prime/678.
    2. Diagnostic Methods in Virology, Virological Methods, Virus Culture, Virus Isolation. [(accessed on 9 July 2020)]; Available online: http://virology-online.com/general/Test1.htm.
    3. Happi A.N., Happi C.T., Schoepp R.J. Lassa fever diagnostics: Past, present, and future. Curr. Opin. Virol. 2019;37:132–138. doi: 10.1016/j.coviro.2019.08.002.
    4. Lipkin W.I., Firth C. Viral surveillance and discovery. Curr. Opin. Virol. 2013;3:199–204. doi: 10.1016/j.coviro.2013.03.010.
    5. Lipkin W.I., Anthony S.J. Virus hunting. Virology. 2015;479–480:194–199. doi: 10.1016/j.virol.2015.02.006.
