The protein structurome of Orthornavirae and its dark matter
- PMID: 39714180
- PMCID: PMC11796362
- DOI: 10.1128/mbio.03200-24
Abstract
Metatranscriptomics is uncovering more and more diverse families of viruses with RNA genomes comprising the viral kingdom Orthornavirae in the realm Riboviria. Thorough protein annotation and comparison are essential to gain insight into the functions of viral proteins and virus evolution. In addition to sequence- and HMM profile-based methods, protein structure comparison provides a powerful tool to uncover protein functions and relationships. We constructed an Orthornavirae "structurome" consisting of already annotated as well as unannotated ("dark matter") proteins and domains encoded in viral genomes. We used protein structure modeling and similarity searches to illuminate the remaining dark matter in hundreds of thousands of orthornavirus genomes. The vast majority of the dark matter domains showed either "generic" folds, such as single α-helices, or no high-confidence structure predictions. Nevertheless, a variety of lineage-specific globular domains that were new either to orthornaviruses in general or to particular virus families were identified within the proteomic dark matter of orthornaviruses, including several predicted nucleic acid-binding domains and nucleases. In addition, we identified a case of exaptation of a cellular nucleoside monophosphate kinase as an RNA-binding protein in several virus families. Notwithstanding the continuing discovery of numerous orthornaviruses, it appears that all the protein domains conserved in large groups of viruses have already been identified. The rest of the viral proteome seems to be dominated by poorly structured domains, including intrinsically disordered ones, that likely mediate specific virus-host interactions.
Importance: Advanced methods for protein structure prediction, such as AlphaFold2, greatly expand our capability to identify protein domains and infer their likely functions and evolutionary relationships. This is particularly pertinent for proteins encoded by viruses, which are known to evolve rapidly and, as a result, often cannot be adequately characterized by analysis of the protein sequences alone. We performed an exhaustive structure prediction and comparative analysis for uncharacterized proteins and domains ("dark matter") encoded by viruses with RNA genomes. The results show that the dark matter of the RNA virus proteome consists mostly of disordered and all-α-helical domains that cannot be readily assigned a specific function and that likely mediate various interactions between viral proteins and between viral and host proteins. The great majority of globular proteins and domains of RNA viruses are already known, although we identified several unexpected domains represented in individual viral families.
Keywords: Orthornavirae; RNA virus; novel protein domains; protein structure prediction; proteome.
Conflict of interest statement
The authors declare no conflict of interest.