[Preprint]. 2024 Jan 21:2024.01.20.576352.
doi: 10.1101/2024.01.20.576352.

Viroid-like colonists of human microbiomes

Ivan N Zheludev et al. bioRxiv.

Update in

  • Viroid-like colonists of human microbiomes.
    Zheludev IN, Edgar RC, Lopez-Galiano MJ, de la Peña M, Babaian A, Bhatt AS, Fire AZ. Cell. 2024 Nov 14;187(23):6521-6536.e18. doi: 10.1016/j.cell.2024.09.033. Epub 2024 Oct 30. PMID: 39481381

Abstract

Here, we describe the "Obelisks," a previously unrecognised class of viroid-like elements that we first identified in human gut metatranscriptomic data. Obelisks share several properties: (i) apparently circular RNA genome assemblies of ~1 kb, (ii) predicted rod-like secondary structures encompassing the entire genome, and (iii) open reading frames coding for a novel protein superfamily, which we call the "Oblins". We find that Obelisks form their own distinct phylogenetic group with no detectable sequence or structural similarity to known biological agents. Further, Obelisks are prevalent in tested human microbiome metatranscriptomes, with representatives detected in ~7% of analysed stool metatranscriptomes (29/440) and in ~50% of analysed oral metatranscriptomes (17/32). Obelisk compositions appear to differ between the anatomic sites, and Obelisks are capable of persisting in individuals, with continued presence over >300 days observed in one case. Large-scale searches identified 29,959 Obelisks (clustered at 90% nucleotide identity), with examples from all seven continents and in diverse ecological niches. From this search, a subset of Obelisks was identified that codes for Obelisk-specific variants of the hammerhead type-III self-cleaving ribozyme. Lastly, we identified one case of a bacterial species (Streptococcus sanguinis) in which a subset of defined laboratory strains harboured a specific Obelisk RNA population. As such, Obelisks comprise a class of diverse RNAs that have colonised, and gone unnoticed in, human and global microbiomes.
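
The prevalence figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline. The counts come from the abstract, while the function name `prevalence_percent` and the dictionary layout are assumptions made for this example.

```python
# Counts as reported in the abstract: (Obelisk-positive samples, samples analysed).
counts = {
    "stool": (29, 440),
    "oral": (17, 32),
}


def prevalence_percent(positive: int, total: int) -> float:
    """Return the percentage of analysed samples in which an Obelisk was detected.

    Hypothetical helper for this example; not taken from the paper's code.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * positive / total


for site, (positive, total) in counts.items():
    pct = prevalence_percent(positive, total)
    print(f"{site}: {positive}/{total} = {pct:.1f}%")
# stool: 29/440 = 6.6%
# oral: 17/32 = 53.1%
```

The exact values, 6.6% and 53.1%, are consistent with the rounded "~7%" and "~50%" given in the abstract.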

Keywords: Hepatitis Delta Virus; Human Microbiome; RNA metaviromics; Viroid.

Figures

Figure 1. Obelisk alpha has a predicted extensive secondary structure and appears to colonise and speciate within the human gut
a) Overview of the iterative approach taken in Obelisk discovery (see methods). b) Schematic of the predicted sense consensus secondary structure derived from all non-redundant, 1164 nt Obelisk-αs found using SRA-scale k-mer matching (PebbleScout). Predicted open reading frames (ORFs) 1 and 2 (green/yellow) and Shine-Dalgarno sequences (purple) are shown. c) “Jupiter” plot of Obelisk-α coloured as in b); chords illustrate predicted basepairs (basepair probabilities from grey, 0.1, to red, 1.0). d) Obelisk-α relative read abundance for six donors (A-G); sequence data from Lloyd-Price et al., 2019, and time in days from first sample. e) Principal component analysis of sequence variation seen in Obelisk-α reads in Lloyd-Price et al., 2019 (the initial iHMP dataset), grouped by k-means clustering with 5 centres, coloured as in d).
Figure 2. Obelisks encode putatively well-folded proteins
a) Obelisk open reading frame 1 (Oblin-1) is predicted (total mean-pLDDT ± SD = 83.8 ± 13.4, see methods) to fold into a stereotyped N-terminal “globule” formed of a three alpha helix (orange) bundle partially wrapping around an orthogonal four helix bundle, capped with a beta sheet “clasp” (blue, globule mean-pLDDT = 90.1 ± 8.7), joined by an intervening region harbouring the conserved domain-A (magenta) with no predicted tertiary structure, to an arbitrarily placed C-terminal alpha helix. “Globule” emphasised on the right. b) A to-scale (secondary structure) topological representation of Oblin-1 with the “globule” shaded in grey, and the domain-A emphasised with this bit-score sequence logo (see methods). c) Obelisk Oblin-2 is confidently predicted (mean-pLDDT = 97.1 ± 4.6) to fold into an alpha helix which appears to be a leucine zipper. Sequence logo of an “i+7” leucine spacing emphasised in red, with hydrophobic “d” position residues emphasised in yellow (expanded in Supplementary Figure 4b). d) Homo-multimer predictions of Obelisk-alpha Oblin-2. Top: dimer (mean-pLDDT = 94.6 ± 0.6); bottom: trimer (mean-pLDDT = 93.6 ± 0.6). Side-on representations of homomultimers shown with numbers of inter-helix salt-bridges (see Supplementary Figure 5).
Figure 3. Obelisks form their own globally distributed phylogenetic group
a) A maximum likelihood, midpoint-rooted phylogenetic tree (see methods) constructed from a non-redundant set of 3265 Serratus and RDVA domain-A sequences, with RDVA genomes positive for Obelisk-variant self-cleaving Hammerhead Type III ribozymes illustrated as orange circles on leaves, the top four known classes of SRA “host” metadata depicted as the colour band (see legend), and per-RDVA-genome co-occurrence of Oblin-2 (based on blastp hits against the Oblin-2 consensus) illustrated as the outer ring (black studs). Leaves that correspond to domain-A sequences from Figure 4 are illustrated with stars. b) Counts of non-de-replicated SRA datasets used to construct a), sorted by their “host” metadata; we note that “host” metadata likely fails to account for other organisms’ genetic material that was sequenced alongside the “host” (e.g. signals from these hosts’ microbiomes may be detected in tandem). c) Counts of non-de-replicated SRA datasets used to construct a), arranged by sample geolocation (where known) and illustrated on a world map (darker orange = more SRA datasets contributed to a)). We note that SRA counts are not expected to correlate with true geo-/ecological prevalence, but are still indicative of global presence.
Figure 4. Obelisks form a self-consistent set
Predicted Obelisk secondary structures depicted as “jupiter” plots where chords represent predicted basepairs (coloured by basepair probability from 0, grey, to 1, red, see methods) with predicted open reading frames (ORFs, preceded by predicted Shine-Dalgarno sequences, purple) depicted: Oblin-1 (green), Oblin-2 (yellow, based on blastp hits against the Oblin-2 consensus), and “2ndORF” (orange). Obelisk-γ’s suggested CRISPR spacer match is illustrated in light blue. ColabFold predictions of Oblin-1 tertiary “globule” structures built with ad hoc multiple sequence alignment (MSA) construction (coloured cartoons) are superimposed over the RDVA-derived MSA prediction for Obelisk-α where possible (black line, Figure 2a, see methods). Prediction confidence (pLDDT) is shown as cartoon colouring as in Supplementary Figure 3. Greek letter key: α : alpha, β : beta, γ : gamma, δ : delta, ε : epsilon, ζ : zeta, η : eta, θ : theta, ι : iota, κ : kappa, λ : lambda, μ : mu, ν : nu, and ξ : xi.
