Cell. 2024 Nov 14;187(23):6521-6536.e18.
doi: 10.1016/j.cell.2024.09.033. Epub 2024 Oct 30.

Viroid-like colonists of human microbiomes

Ivan N Zheludev et al. Cell. 2024.

Abstract

Here, we describe "obelisks," a class of heritable RNA elements sharing several properties: (1) apparently circular RNA ∼1 kb genome assemblies, (2) predicted rod-like genome-wide secondary structures, and (3) open reading frames encoding a novel "Oblin" protein superfamily. A subset of obelisks includes a variant hammerhead self-cleaving ribozyme. Obelisks form their own phylogenetic group without detectable similarity to known biological agents. Surveying globally, we identified 29,959 distinct obelisks (clustered at 90% sequence identity) from diverse ecological niches. Obelisks are prevalent in human microbiomes, with detection in ∼7% (29/440) and ∼50% (17/32) of queried stool and oral metatranscriptomes, respectively. We establish Streptococcus sanguinis as a cellular host of a specific obelisk and find that this obelisk's maintenance is not essential for bacterial growth. Our observations identify obelisks as a class of diverse RNAs of yet-to-be-determined impact that have colonized and gone unnoticed in human and global microbiomes.

Keywords: RNA metaviromics; hepatitis delta virus; human microbiome; viroid.
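
The prevalence figures quoted in the abstract are rounded. As a minimal sketch (the helper name `prevalence` is hypothetical and not taken from the paper), the reported counts can be turned back into fractions to see what the rounding hides: 29/440 is about 6.6%, and 17/32 is about 53%.

```python
def prevalence(positive: int, total: int) -> float:
    """Fraction of queried datasets in which an obelisk was detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    return positive / total


# Counts exactly as reported in the abstract.
stool = prevalence(29, 440)  # stool metatranscriptomes
oral = prevalence(17, 32)    # oral metatranscriptomes

print(f"stool: {stool:.1%}")
print(f"oral:  {oral:.1%}")
```

The oral estimate rests on only 32 datasets, so it is the less precise of the two.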

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Figure 1. Obelisk alpha has a predicted extensive secondary structure and appears to colonise and speciate within the human gut
a) Overview of the iterative approach taken in Obelisk discovery (see methods). b) Schematic of the predicted sense consensus secondary structure derived from all non-redundant, 1164 nt Obelisk-αs found using SRA-scale k-mer matching (PebbleScout). Predicted open reading frames (ORFs) 1 and 2 (green/yellow) and Shine-Dalgarno sequences (purple) are shown. c) “Jupiter” plot of Obelisk-α coloured as in b); chords illustrate predicted basepairs (basepair probabilities grey, 0.1, to red, 1.0). d) Obelisk-α relative read abundance for six donors (A-G); sequence data from Lloyd-Price et al., 2019, and time in days from first sample. e) Principal component analysis of sequence variation seen in Obelisk-α reads in Lloyd-Price et al., 2019 (the initial iHMP dataset), grouped by k-means clustering with 5 centres, coloured as in d).
Figure 2. Obelisks encode putatively well-folded proteins
a) Obelisk open reading frame 1 (Oblin-1) is predicted (total mean-pLDDT ± SD = 83.8 ± 13.4, see methods) to fold into a stereotyped N-terminal “globule” formed of a three alpha helix (orange) bundle partially wrapping around an orthogonal four helix bundle, capped with a beta sheet “clasp” (blue, globule mean-pLDDT = 90.1 ± 8.7), joined by an intervening region harbouring the conserved domain-A (magenta) with no predicted tertiary structure, to an arbitrarily placed C-terminal alpha helix. “Globule” emphasised on the right. b) a to-scale (secondary structure) topological representation of Oblin-1 with the “globule” shaded in grey, and the domain-A emphasised with this bit-score sequence logo (see methods). c) Obelisk Oblin-2 is confidently predicted (mean-pLDDT = 97.1 ± 4.6) to fold into an alpha helix which appears to be a leucine zipper. Sequence logo of an “i+7” leucine spacing emphasised in red, with hydrophobic “d” position residues emphasised in yellow (expanded in Supplementary Figure 4c). d) homo-multimer predictions of Obelisk-alpha Oblin-2. top: dimer (mean-pLDDT = 94.6 ± 0.6), bottom: trimer (mean-pLDDT = 93.6 ± 0.6). Side-on representations of homomultimers shown with numbers of inter-helix salt-bridges (see Supplementary Figure 4d).
Figure 3. Obelisks form a globally distributed phylogenetic group
a) a pairwise distance, neighbour joining, midpoint-rooted, dendrogram (branch lengths ignored, see methods) constructed from a non-redundant set of 1641 RDVA Oblin-1 sequences, with Obelisk-variant self cleaving Hammerhead Type III ribozymes illustrated as orange circles on leaves, and Obelisks possessing exactly two predicted open reading frames indicated with black triangles. Leaves that correspond to sequences from Figure 6 are illustrated with magenta circles and are coloured by their microbiome of discovery (red = gastric, blue = oral, black = unknown). b) Counts of filtered SRA datasets from Serratus and RDVA sorted by their “host” metadata (see methods). c) Datasets from b) arranged by sample geolocation (where known) illustrated on a world map (darker orange = more SRA datasets). We note that SRA counts are not expected to correlate with true geo-/ecological prevalence, but are still indicative of global presence.
Figure 4. Obelisk-S.s is dispensable for SK36 growth in replete conditions
a) Streptococcus sanguinis SK36 substrains positive (ObP1) and negative (ObN1) for Obelisk-S.s do not appear to grow discernibly differently in replete aerobic liquid culture (octuplet cultures of triplicate isolated substrains per ObP1/ObN1, brain heart infusion broth, 37 °C, see methods). Likewise, computed growth characteristics do not show discernible effects from loss of Obelisk-S.s either in lag time (b, mean ± SD, 5.7 ± 0.37 hours for ObN1, and 5.7 ± 0.47 hours for ObP1), or in doubling time (c, mean ± SD, 47.2 ± 2.9 minutes for ObN1, and 48.4 ± 3.4 minutes for ObP1). d) Short read sequencing (see methods) of triplicate cultures of ObN1 and ObP1 (red and blue, respectively) indicates that Obelisk-S.s is exclusively RNA (see also Supplemental Figure 3b), with the RNA accounting for 0.6 ± 0.04 % of the total ObP1 transcriptome. e) Differential expression analysis indicates that, under these growth conditions and statistical methods (see methods), no transcripts other than Obelisk-S.s were significantly differentially expressed between ObP1 and ObN1 (blue = q-value ≤ 0.05). Links to analysis results for rRNA-depleted and RNaseR-treated data are available in the Key resources table.
Figure 5. Streptococcus sanguinis SK36 harbours Obelisk-S.s
a) “Jupiter” plot of the Streptococcus sanguinis SK36 Obelisk-S.s (“Obelisk_000003” in Supplementary Table 2) illustrated as in Figure 1c, chords illustrate predicted basepairs (basepair probabilities grey, 0.1, to red, 1.0) with the addition of annotations for primer sites used in characterization (outer track, provided in the Key resources table) and of coverage plots of data shown in b) (inner track). b) Distribution of total RNA sequence k-mers matching the Obelisk-S.s in the ObP1 RNA sequence data (see methods). Under these experimental and analytical conditions, Obelisk-S.s appears to be predominantly antisense (relative to Oblin-1), with 93.6 ± 1.2 % of k-mers mapping to the antisense strand. Links to analysis results for rRNA-depleted and RNaseR-treated data are available in the Key resources table.
Figure 6. Obelisks form a self-consistent set
Predicted Obelisk secondary structures depicted as “jupiter” plots where chords represent predicted basepairs (coloured by basepair probability from 0, grey, to 1, red, see methods) with predicted open reading frames (ORFs, preceded by predicted Shine-Dalgarno sequences, purple) depicted: Oblin-1 (green), Oblin-2 (yellow, based on blastp hits against the Oblin-2 consensus), and “2ndORF” (orange). Obelisk-γ’s suggested CRISPR spacer match illustrated in light blue. ColabFold predictions of Oblin-1 tertiary “globule” structures built with ad hoc multiple sequence alignment (MSA) construction (coloured cartoons) superimposed over the RDVA-derived MSA prediction for Obelisk-α where possible (black line, Figure 4a, see methods). Prediction confidence (pLDDT) shown as cartoon colouring as in Supplementary Figure 2. Greek letter key: α : alpha, β : beta, γ : gamma, δ : delta, ε : epsilon, ζ : zeta, η : eta, θ : theta, ι : iota, κ : kappa, λ : lambda, μ : mu, ν : nu, and ξ : xi.
