PLoS One. 2008;3(4):e2018.
doi: 10.1371/journal.pone.0002018. Epub 2008 Apr 16.

Rickettsia phylogenomics: unwinding the intricacies of obligate intracellular life

Joseph J Gillespie et al. PLoS One. 2008.

Abstract

Background: Completed genome sequences are rapidly increasing for Rickettsia, obligate intracellular alpha-proteobacteria responsible for various human diseases, including epidemic typhus and Rocky Mountain spotted fever. In light of phylogeny, the establishment of orthologous groups (OGs) of open reading frames (ORFs) will distinguish the core rickettsial genes and other group-specific genes (class 1 OGs or C1OGs) from those distributed indiscriminately throughout the rickettsial tree (class 2 OGs or C2OGs).

Methodology/principal findings: We present 1823 representative (no gene duplications) and 259 non-representative (at least one gene duplication) rickettsial OGs. While the highly reductive (approximately 1.2 Mb) Rickettsia genomes range in predicted ORFs from 872 to 1512, a core of 752 OGs was identified, depicting the essential Rickettsia genes. Unsurprisingly, this core lacks many metabolic genes, reflecting the dependence on host resources for growth and survival. Additionally, we bolster our recent reclassification of Rickettsia by identifying OGs that define the AG (ancestral group), TG (typhus group), TRG (transitional group), and SFG (spotted fever group) rickettsiae. OGs for insect-associated species, tick-associated species and species that harbor plasmids were also predicted. Through superimposition of all OGs over robust phylogeny estimation, we discern between C1OGs and C2OGs, the latter depicting genes either decaying from the conserved C1OGs or acquired laterally. Finally, scrutiny of non-representative OGs revealed high levels of split genes versus gene duplications, with both phenomena confounding gene orthology assignment. Interestingly, non-representative OGs, as well as OGs comprised of several gene families typically involved in microbial pathogenicity and/or the acquisition of virulence factors, fall predominantly within C2OG distributions.

Conclusion/significance: Collectively, we determined the relative conservation and distribution of 14354 predicted ORFs from 10 rickettsial genomes across robust phylogeny estimation. The data, available at PATRIC (PathoSystems Resource Integration Center), provide novel information for unwinding the intricacies associated with Rickettsia pathogenesis, expanding the range of potential diagnostic, vaccine and therapeutic targets.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Venn diagram depicting 15 intersections for the four rickettsial groups.
Classification scheme based on molecular phylogeny estimation, the topology of which is shown in the lower left; AG = ancestral group, TG = typhus group, TRG = transitional group, SFG = spotted fever group. Genome codes are as follows: Br = R. bellii str. RML369-C, Bo = R. bellii str. OSU 85-389, Ca = R. canadensis str. McKiel, Pr = R. prowazekii str. Madrid E, Ty = R. typhi str. Wilmington, Ak = R. akari str. Hartford, Fe = R. felis str. URRWXCal2, Ri = R. rickettsii str. Sheila Smith CWPP, Co = R. conorii str. Malish 7, and Si = R. sibirica str. 246. Arthropod hosts are illustrated for each genome, and strains known to harbor plasmids are depicted.
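The count of 15 intersections follows from the four groups: every non-empty combination of AG, TG, TRG, and SFG is one region of the diagram, giving 2^4 - 1 = 15. The following is a minimal sketch of that enumeration only; the function name `venn_regions` is hypothetical and is not taken from the paper.

```python
from itertools import combinations

# The four rickettsial groups named in the Figure 1 legend.
GROUPS = ["AG", "TG", "TRG", "SFG"]


def venn_regions(groups):
    """Return every non-empty combination of groups, one per Venn region."""
    return [
        combo
        for size in range(1, len(groups) + 1)
        for combo in combinations(groups, size)
    ]


regions = venn_regions(GROUPS)
print(len(regions))  # 15 = 4 single + 6 pairs + 4 triples + 1 all-four
```

Each tuple in `regions` names the groups that share the OGs counted in that region, for example `("TG", "TRG")` for OGs found only in the typhus and transitional groups.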
Figure 2. Alignment of 10 rickettsial genomes.
Taxa are in the same position as in estimated trees in Figure 3, with taxon abbreviations explained in the Figure 1 legend. Alignment created using Mauve after reindexing the R. sibirica genome (see text for details).
Figure 3. Estimated phylogenies of ten rickettsial taxa based on 731 representative core proteins.
(A) Tree from Bayesian analysis. Three MCMC chains were primed with a neighbor-joining tree and run independently for 25000 generations in model-jumping mode. Burn-in was attained by 2500 generations for all chains, and a single tree topology with exclusive use of the Jones substitution model was observed in post burn-in data. The consensus tree shown here thus has 100% support for every branch. Branch support is from the distribution of posterior probabilities from all trees minus the burn-in. (B) Tree from exhaustive search using parsimony. Branch support is from one million bootstrap replicates.
Figure 4. Illustration of representative and non-representative OGs and their categorization into Class 1 and Class 2 OGs.
Taxon abbreviations are explained in the Figure 1 legend. Dark circles depict gene presence, while open circles depict gene absence. (A) Representative OGs: orthologous groups with only one ORF per included genome. Our analysis includes ten rickettsial genomes, so a representative OG contains between 2 and 10 ORFs. Four examples are shown. (B) Non-representative OGs: orthologous groups with multiple ORFs from at least one included genome, arising from either recent (orthologs) or distant (paralogs) gene duplications (dupl). False singleton OGs contain only one taxon, but with multiple ORFs from that taxon (example on right). Four examples are shown. (C) Class 1 OGs (C1OGs): orthologous groups comprising single rickettsial groups (e.g., AG, TG, TRG, and SFG), shared rickettsial groups (subgeneric), plasmid-harboring genomes, and genomes with common arthropod hosts. Two representative (left) and two non-representative (right) C1OGs are shown. (D) Class 2 OGs (C2OGs): orthologous groups with patchy distribution across the rickettsial tree, depicting gene losses and/or genes acquired laterally. Two representative and two non-representative C2OGs are shown.
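The definitions in panels A and B can be written as a small classifier. This is a sketch under stated assumptions, not the pipeline used in the study: the input format (a list of `(genome_code, orf_id)` pairs per OG), the function names `classify_og` and `is_core`, and the ORF identifiers in the usage lines are all hypothetical, and "core" is assumed here to mean present in all ten genomes.

```python
from collections import Counter

# Genome codes from the Figure 1 legend.
GENOMES = ["Br", "Bo", "Ca", "Pr", "Ty", "Ak", "Fe", "Ri", "Co", "Si"]


def classify_og(members):
    """Classify one group of ORFs given (genome_code, orf_id) pairs.

    Assumed input format; the labels follow the Figure 4 legend.
    """
    if not members:
        raise ValueError("an OG needs at least one ORF")
    per_genome = Counter(genome for genome, _ in members)
    if len(members) == 1:
        return "singleton"            # lone ORF, not part of any OG
    if len(per_genome) == 1:
        return "false singleton"      # several ORFs, all from one genome
    if max(per_genome.values()) == 1:
        return "representative"       # exactly one ORF per included genome
    return "non-representative"       # some genome contributes multiple ORFs


def is_core(members):
    """True if every one of the ten genomes contributes at least one ORF."""
    return {genome for genome, _ in members} == set(GENOMES)


# Hypothetical ORF identifiers, for illustration only.
print(classify_og([("Pr", "orf1"), ("Ty", "orf2")]))
print(classify_og([("Fe", "orf3"), ("Fe", "orf4"), ("Ak", "orf5")]))
```

Separating Class 1 from Class 2 OGs would additionally require comparing each OG's genome set against the groupings in the estimated tree, which this sketch does not attempt.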
Figure 5. Comparison of the distributions of 1300 representative and 145 non-representative class 1 OGs (C1OGs), 66 false singletons, and 1467 singleton ORFs.
Slices depict 16 generic and subgeneric groups, false singletons, singletons, plasmid associated groups, and two host-related groups, with outer circle colors depicted in schema. Taxon abbreviations, including subgeneric groups, are explained in the Figure 1 legend. (A) Distribution of 1300 representative C1OGs and 1467 singletons. (B) Distribution of 79 non-representative C1OGs and 66 false singletons.
Figure 6. Manual curation of 259 non-representative OGs predicted by OrthoMCL.
Schema depicts 179 OGs repaired to representative after stitching together split ORFs (larger pie chart) and remaining true non-representative OGs defined by in-paralogs.
Figure 7. Distribution of representative and non-representative class 1 OGs (C1OGs) and singleton ORFs over estimated rickettsial phylogeny.
Boxes depict the distribution of phylogenetic groups, singletons, plasmid-associated groups, and host-related groups: Red = AG rickettsiae, aquamarine = TG rickettsiae, blue = TRG rickettsiae, brown = SFG rickettsiae, gray = higher-level groupings, light green = R. bellii strains only. Orange boxes depict genes found on the pRF plasmid of R. felis str. URRWXCal2 and on the chromosomes of R. felis and both R. bellii strains (as of this publication the R. bellii plasmids remain unavailable). Genes specific to single rickettsial genomes (singletons) are in yellow boxes, with taxon abbreviations explained in the Figure 1 legend. Host-specific groups are defined by green (insect) and tan (tick) boxes. Genome statistics were compiled from the PATRIC and NCBI databases. Cladogram is based on trees shown in Figure 3. Inset in dashed box describes general schema for each box. *Total R. felis genome size: 1,485,148 bp = chromosome; 62,829 bp = pRF and 39,263 bp = pRFδ.
Figure 8. Bioinformatic analysis of core representative OGs.
(A) Assignment of 731 core representative RiOGs to predicted cellular function categories. Format follows that established at the COG database (NCBI) except for cf = combined function and rpe = rickettsial palindromic element. (B) Comparison of the distribution of cellular function categories across 731 core rickettsial OGs (Ri), a recent protein expression profile for R. felis (Rf), and COGs for three other bacteria: Escherichia coli (Ec), Yersinia pestis (Yp) and Chlamydia trachomatis (Ct). Inset at left shows the number of genes per genome for cellular function categories involved in organic and inorganic transport and metabolism (E, F, G, H, I, P, and Q) followed by the percentage these genes comprise of total protein-encoding genes. Results from a six-way regression analysis are shown in the right inset.
Figure 9. Phylogeny estimation of the ten analyzed rickettsial taxa plus R. helvetica and R. australis based on 16 proteins.
See Table S13 for gene names and sequence accession numbers. Tree estimated under parsimony (see text).
Figure 10. Analysis of the distribution of 1467 singleton ORFs omitted from OG prediction across 10 rickettsial genomes.
(A) Singleton ORFs across four rickettsial groups. (B) Singleton ORFs across 10 rickettsial genomes. First number is total number of singleton ORFs per taxon, with second number the total singleton ORFs annotated as HPs. Dashed lines in pie charts separate characterized proteins from HPs, with percentages given only for HPs. (C) Average lengths of singleton ORFs with predicted functions versus singleton ORFs annotated as HPs for all ten analyzed rickettsial genomes.

