Opportunities and Challenges of Data-Driven Virus Discovery
- PMID: 36008967
- PMCID: PMC9406072
- DOI: 10.3390/biom12081073
Abstract
Virus discovery has been fueled by new technologies ever since the first viruses were discovered at the end of the 19th century. Starting with mechanical devices that provided evidence for virus presence in sick hosts, virus discovery gradually transitioned into a sequence-based scientific discipline, which, nowadays, can characterize virus identity and explore viral diversity at an unprecedented resolution and depth. Sequencing technologies are now being used routinely and at ever-increasing scales, producing an avalanche of novel viral sequences found in a multitude of organisms and environments. In this perspective article, we argue that virus discovery has started to undergo another transformation prompted by the emergence of new approaches that are sequence data-centered and primarily computational, setting them apart from previous technology-driven innovations. The data-driven virus discovery approach is largely uncoupled from the collection and processing of biological samples, and exploits the availability of massive amounts of publicly and freely accessible data from sequencing archives. We discuss open challenges to be solved in order to unlock the full potential of data-driven virus discovery, and we highlight the benefits it can bring to classical (mostly molecular) virology and molecular biology in general.
Keywords: computational virology; data mining; sequencing archives; virosphere in health and disease; virus discovery.
Conflict of interest statement
The authors declare no conflict of interest.