[Preprint]. 2023 Feb 27:2023.02.26.530123.
doi: 10.1101/2023.02.26.530123.

Metagenome diversity illuminates origins of pathogen effectors

Victoria I Verhoeve et al. bioRxiv.

Abstract

Recent metagenome-assembled genome (MAG) analyses have profoundly impacted Rickettsiology systematics. Discovery of basal lineages (Mitibacteraceae and Athabascaceae) with predicted extracellular lifestyles reveals an evolutionary timepoint for the transition to host dependency, which occurred independent of mitochondrial evolution. Notably, these basal rickettsiae carry the Rickettsiales vir homolog (rvh) type IV secretion system (T4SS) and purportedly use rvh to kill congener microbes rather than parasitize host cells as described for derived rickettsial pathogens. MAG analysis also substantially increased diversity for genus Rickettsia and delineated a basal lineage (Tisiphia) that stands to inform on the rise of human pathogens from protist and invertebrate endosymbionts. Herein, we probed Rickettsiales MAG and genomic diversity for the distribution of Rickettsia rvh effectors to ascertain their origins. A sparse distribution of most Rickettsia rvh effectors outside of Rickettsiaceae lineages indicates unique rvh evolution from basal extracellular species and other rickettsial families. Remarkably, nearly every effector was found in multiple divergent forms with variable architectures, illuminating profound roles for gene duplication and recombination in shaping effector repertoires in Rickettsia pathogens. Lateral gene transfer plays a prominent role in shaping the rvh effector landscape, as evinced by the discovery of many effectors on plasmids and conjugative transposons, as well as pervasive effector gene exchange between Rickettsia and Legionella species. Our study exemplifies how MAGs can provide incredible insight on the origins of pathogen effectors and how their architectural modifications become tailored to eukaryotic host cell biology.

Keywords: Rickettsia; effector; evolution; metagenome; type IV secretion system.

Figures

FIGURE 1. Probing Rickettsiales diversity for the evolution of Rickettsia type IV secretion system effectors.
(A) The atypical Rickettsiales vir homolog (rvh) T4SS is a hallmark of Rickettsiales, arising before the origin of host dependency. Schema depicts recent genome-based phylogeny estimation. Rvh characteristics are described at bottom and further in FIG. S1. (B) List of Rickettsia rvh effector molecules (REMs) and candidate REMs (cREMs). GBA, guilty by association. ‘Implied’ means analogous proteins are known to be secreted by other bacteria and/or the effector has strongly predicted eukaryotic molecular targets. Secretion, coimmunoprecipitation (CoIP), and bacterial 2-hybrid (B2H) data are compiled from prior reports. SWA, Schuenke Walker Antigen domain. (C) Phylogenomic analysis of Rickettsia REMs and cREMs in non-Rickettsiaceae lineages. Cladogram summarizes a phylogeny estimated from concatenated alignments for RvhB4-I and RvhB4-II proteins from 153 rickettsial assemblies (full tree, FIG. S2; sequence information, Table S1). Non-Rickettsiaceae lineages are shown (see FIG. 2 for Rickettsiaceae). Dashed box for Pat1 proteins indicates the inability to discern Pat1a and Pat1b homology outside of Tisiphia and Rickettsia species (see FIG. 2; FIG. 4). SWAMP, SWA modular proteins. Information for all REMs and cREMs is provided in Table S2.
FIGURE 2. Phylogenomic analysis of Rickettsia REMs and cREMs in Rickettsiaceae.
Cladogram (continued from FIG. 1C) summarizes a phylogeny estimated from concatenated alignments for RvhB4-I and RvhB4-II proteins from 153 rickettsial assemblies (full tree, FIG. S2; sequence information, Table S1). “Candidatus Sneabacter namystus” (highlighted yellow) was manually added to the cladogram based on prior phylogeny estimation, as this species lacks rvh genes but carries a type VI secretion system (T6SS; see FIG. S3). Black boxes provide short names for 29 MAGs from Davison et al. (NOTE: the clade colored green comprises genus Tisiphia though genus name Rickettsia reflects NCBI taxonomy as of Feb. 26th, 2023). Asterisks depict multiple genome assemblies for a species. BG, Bellii Group; TRG, Transitional Group; TG, Typhus Group; TIG, Tamurae-Ixodes Group; SFG, Spotted Fever Group. Dashed box for Pat1 proteins indicates the inability to discern Pat1a and Pat1b homology outside of Tisiphia and Rickettsia species (see FIG. 4). Yellow boxes denote Risk2 proteins that are appended to C-terminal Schuenke-Walker antigen (SWA) domains (see FIG. 5). SWAMP, SWA modular proteins; all other REMs and cREMs are described in FIG. 1B and Table S2. Half circles for rCRCT-3a depict the presence of one or more antidotes but no toxin.
FIGURE 3. MAG analysis divulges a greater diversity of Sec7-domain-containing proteins than previously appreciated.
Black boxes provide short names for MAGs from Davison et al. These and additional newly discovered RalF-like proteins (highlighted yellow) substantially expand the prior RalF diversity. Structural models for proteins are found in FIG. S4D–F. (A,B) Novel insight from new (A) Legionella and rickettsial architectures and (B) new RalF-like proteins discovered in MAGs. Red shading and numbers indicate % aa identity across pairwise alignments (sequence information in Table S2). All protein domains are described in the gray inset. (C) Comparison of the Legionella pneumophila RalF structure (PDB 4C7P) with predicted structures of S7D-SCD regions of RiCimp RalF (LF885_07310) and R. typhi RalF (RT0362), and S7Ds of Mycobacteriaceae sp. co.spades.METABAT.1kb_110 (K2X97_15435) and Proteobacterium SZAS-39 (JSR17_09325). The delineation of the Sec7 domain (S7D, red) and Sec7-capping domain (SCD, green if present) is shown with an approximation of the active site Glu (asterisk). Additional eukaryotic-like domains for the non-rickettsial proteins are noted. Modeling done with Phyre2. More detailed structural explanation is found in FIG. S4C. (E) RiCimp plasmid pRiCimp001 carries RalF and a TA module similar to those characterized in reproductive parasitism. Gene region drawn to scale using PATRIC compare region viewer tool. Yellow, transposases and other mobile elements; skull-and-crossbones, pseudogenes; other domains described in gray inset at bottom. Plasmid map created with Proksee (https://proksee.ca/).
FIGURE 4. Divergent patatin phospholipases are recurrent in rickettsial evolution.
Black boxes provide short names for MAGs from Davison et al. (A) PLA2 active site characteristics and divergent patatin forms. Green inset describes general patatin domain and active site architecture. HaloBlast results for Pat1A, Pat1B, and Pat2 (query sequences described at top) are shown, with top-scoring halos boxed (full results in Table S4). (B) Sequence logo showing conservation of the PLA2 active site motifs across Tisiphia and Rickettsia patatins (sequence information provided in Table S2). Pat1A, Pat1B, and Pat2 sequences were aligned separately with MUSCLE (default parameters) with active site motifs compiled for conservation assessment. Features unique to each patatin are noted. (C) Legionella pneumophila VipD structure (PDBID: 4AKF) and modeling of three rickettsial patatins to VipD using Phyre2. (D) Diverse architectures for select patatins. Red shading and numbers indicate % aa identity across pairwise alignments (sequence information in Table S2). All protein domains are described in the gray inset. Dark green indicates Pat1 domains not grouped into A or B. (E) Four Rickettsia plasmids carry pat1B (shaded orange). Plasmid maps created with Proksee (https://proksee.ca/).
FIGURE 5. Discovery of a novel rickettsial PI kinase that can associate with a widespread rickettsial surface antigen.
Black boxes provide short names for MAGs from Davison et al. Amino acid coloring is described in the FIG. 3 legend. (A) Previous work (above dashed line) identified a Rickettsia PI kinase, Risk1, with a cryptic active site similar to human and other bacterial PI3/PI4 kinases and related protein kinases. Colored shapes depict characterized substrate specificity (see panel B). Our study (below dashed line; select proteins shown) identified new rickettsial Risk1 proteins, as well as a second PI kinase (Risk2) also prevalent in rickettsial genomes and MAGs (FIG. 1C; FIG. 2). All PI3/PI4 kinase domains were aligned using MUSCLE (default parameters). Sequence information is provided in Table S2. Yellow highlighting on end coordinates denotes Risk2 proteins fused to a C-terminal SWA domain (see panel G; FIG. S5). (B) Mechanisms of phosphorylation on the PI inositol ring at 3’, 4’ and 5’ positions. Data for R. typhi Risk1 is superimposed in red. (C) HaloBlast results (R. typhi Risk1 and Risk2 as queries) broken down to illustrate the presence of Rickettsia-like PI kinases in MAGs and the similarity between Risk2 and Legionella PI kinases (full data in Table S2). (D) Risk1 threads with high confidence (90.7%, 72% coverage; Phyre2) to the Helicobacter pylori proinflammatory kinase CtkA. (E) Risk2 threads with high confidence (85.1%, 9% coverage; Phyre2) to a limited region of LepB, a Rab GTPase-activating protein effector from L. pneumophila. (F) Risk1 and Risk2 proteins have cryptic and distinct PI3/4 active sites, yet lack similarity outside of these regions. Logos depict individual alignments, which are summarized at top and were performed with MUSCLE, default settings. (G) Rickettsiae utilize a diverse arsenal of PIK effectors, some of which are tethered to SWA domains. Six select species are shown with their full complement of PI3/4 kinase and SWA architectures. Red shading and numbers indicate % aa identity across pairwise alignments (sequence information in Table S2). R. peacockii pRRP proteins are shaded orange (see panel H). (H) Plasmid pRRP of R. peacockii str. Rustic carries a divergent Risk2 gene that is adjacent to an ORF encoding a SWAMP (orange highlighting). Plasmid map created with Proksee (https://proksee.ca/).
FIGURE 6. RARP-2 architecture is derived from multiple divergent forms.
Black boxes provide short names for MAGs from Davison et al. Amino acid coloring is described in the FIG. 3 legend. Sequence logos constructed with WebLogo 3. Sequence information in Table S2. (A) General architecture of RARP-2 proteins deduced from an alignment of 53 non-redundant RARP-2 proteins using MUSCLE (default parameters). (B) Consensus sequence for the RARP-2 ANK repeat deduced from 404 repeats. (C) Depiction of the 53 non-redundant RARP-2 proteins with ANK domain repeat number provided. For brevity, some strain names are not shown for R. prowazekii: Chernikova, Katsinyian, Dachau, BuV67-CWPP, Rp22; R. japonica: YH, DT-1, HH-1, HH06154, HH07124, HH07167, MZ08014, Nakase, PO-1, Tsuneishi, HH-13, HH06116, HH06125, LON-151, M11012, M14012, M14024, SR1567, YH_M; R. heilongjiangensis: HCN-13; Sendai-29; Sendai-58. (D) RARP-2 and dRARP-2 proteins possess N-terminal domain clan CD cysteine protease-like active sites. Sequences were manually aligned to illustrate the conservation across all diverse protein groups. ‘Rick. endo. UWC8’, endosymbiont of Acanthamoeba str. UWC8 (not shown in FIG. 1C but closely related to endosymbiont of Acanthamoeba str. UWC36 in the Midichloriaceae). (E) Insight on RARP-2/dRARP-2 structure. Asterisks indicate proteins from panel D that were used in Phyre2 searches to identify template structures for modeling. A complete structure of R. typhi RARP-2 predicted with AlphaFold corroborates these dRARP-2 models and indicates deviations on a common effector architecture (FIG. S6C).
FIGURE 7. Gene fission and duplication have shaped the architectures of four candidate REMs.
Black boxes provide short names for 29 MAGs from Davison et al. Gene regions drawn to scale using PATRIC compare region viewer tool. Sequence information is provided in Table S2. (A) Similarity between O. tsutsugamushi effector OtDUB (CAM80065), divergent cREM-1 (cREM-1d), and cREM-1 proteins. OtDUB characterized domains: deubiquitinase (light blue), ubiquitin-binding (pink), cryptic Rac1-like guanine nucleotide exchange factor (yellow), clathrin adaptor-protein complexes AP-1 and AP-2 (orange), phosphatidylserine-binding (gray). Red shading and numbers indicate % aa identity across pairwise alignments. (B) Cladogram depicting phylogeny estimation of 102 cREM-1d and cREM-1 proteins (see FIG. S7A for phylogram and methods). Inset describes classification of cREM-1 proteins, with clade colors matching the protein colors in the schema in panel A. (C) Diversification of cREM-2 proteins via duplication. (D) Cladogram depicting phylogeny estimation of 158 cREM-2d, cREM-2a, and cREM-2b proteins (see FIG. S7B for phylogram and methods). Inset describes classification of cREM-2 proteins, with clade colors matching the protein colors in the schema in panel C. (E) Ancient gene duplication of cREM-4 and location of cREM-4 genes in select Rickettsiales species. The cREM-4 pentapeptide repeat domain is illustrated in FIG. S7D. (F) cREM-5/5p loci occur in variable genomic regions, including plasmids and RAGEs. The complete RAGE for RiCle is illustrated in FIG. S7F. (G) Genomic distribution of ProP genes in Rickettsiales genomes. OP, Rickettsia endosymbionts of Oxypoda opaca (Oopac6) and Pyrocoelia pectoralis (Ppec13); OACHA, Rickettsia endosymbionts of Omalisus fontisbellaquei (Ofont3) and Adalia bipunctata, R. canadensis, R. helvetica, and R. asiatica.
FIGURE 8. MAG analyses lend insight on Rickettsia interactions with host actin cytoskeleton and rvh T4SS function.
Black boxes provide short names for 29 MAGs from Davison et al. (A) MAGs shed light on the evolution of Rickettsia factors behind host actin polymerization and invasion. Tisiphia and BG rickettsiae taxa, as well as SWAMP, SWA-VBD, RickA, and RalF information, are from FIG. 2. The passenger domains of R. conorii Sca2 (AAL02648), R. typhi dSca2 (AAU03539), R. bellii Sca2–6 (ABE05361), R. conorii Sca0 (AAL03811), R. typhi Sca3 (AAU03915), R. typhi Sca1 (AAU03504) and R. typhi Sca5 (AAU04158) were used in BlastP searches directly against Tisiphia and BG rickettsiae genomes. Passenger domains and linker sequences were delineated as previously shown. ReBt, Rickettsia spp. MEAM1, wb, and wq (endosymbionts of Bemisia tabaci). (B) Some Rickettsia genomes encode one or more host actin nucleation proteins. Top: Tisiphia and Rickettsia RickA proteins share a large N-terminal repeat domain but diverge at their C-termini. WH2, Wiskott-Aldrich syndrome protein homology 2 domain. Further details on RickA architecture are provided in FIG. S8. Bottom: Sca2, dSca2 and Sca2–6 proteins have a common autotransporter domain (β) but divergent passenger domains. FH2, formin homology 2. Sca2 mimics host formin actin nucleators to recruit and polymerize actin for intracellular motility and intercellular spread. The functions of dSca2 and Sca2–6 are unknown. (C) The mobile nature of CDI-like/Rhs-like C-terminal toxin/antidote (CRCT/CRCA) modules across diverse rickettsial genomes. Schema shows integration of CRCT/CRCA modules into larger polymorphic toxins (hemagglutinin-like toxins, LysM-like peptidoglycan/chitin-targeting toxins, etc.), as well as CRCT/CRCA modules independent of larger toxins. The toxin warhead for RZI45292 is unknown. Further details are provided in FIG. S9.

References

    1. Salje J. Cells within cells: Rickettsiales and the obligate intracellular bacterial lifestyle. Nat. Rev. Microbiol. 19, 375–390 (2021).
    2. Sahni A., Fang R., Sahni S. K. & Walker D. H. Pathogenesis of Rickettsial Diseases: Pathogenic and Immune Mechanisms of an Endotheliotropic Infection. Annu. Rev. Pathol. Mech. Dis. 14, 127–152 (2019).
    3. Narra H. P., Sahni A., Walker D. H. & Sahni S. K. Recent research milestones in the pathogenesis of human rickettsioses and opportunities ahead. Future Microbiology 15, 753–765 (2020).
    4. Werren J. H., Baldo L. & Clark M. E. Wolbachia: master manipulators of invertebrate biology. Nat. Rev. Microbiol. 6, 741–51 (2008).
    5. Gillespie J. J. et al. A Tangled Web: Origins of Reproductive Parasitism. Genome Biol. Evol. 10, 2292–2309 (2018).