Comparative Study
Infect Immun. 2000 Mar;68(3):1633-48.
doi: 10.1128/IAI.68.3.1633-1648.2000.

Comparative genome analysis of the pathogenic spirochetes Borrelia burgdorferi and Treponema pallidum

G Subramanian et al. Infect Immun. 2000 Mar.

Abstract

A comparative analysis of the predicted protein sequences encoded in the complete genomes of Borrelia burgdorferi and Treponema pallidum provides a number of insights into evolutionary trends and adaptive strategies of the two spirochetes. A measure of orthologous relationships between gene sets, termed the orthology coefficient (OC), was developed. The overall OC value for the gene sets of the two spirochetes is about 0.43, which means that less than one-half of the genes show readily detectable orthologous relationships. This emphasizes significant divergence between the two spirochetes, apparently driven by different biological niches. Different functional categories of proteins as well as different protein families show a broad distribution of OC values, from near 1 (a perfect, one-to-one correspondence) to near 0. The proteins involved in core biological functions, such as genome replication and expression, typically show high OC values. In contrast, marked variability is seen among proteins that are involved in specific processes, such as nutrient transport, metabolism, gene-specific transcription regulation, signal transduction, and host response. Differences in the gene complements encoded in the two spirochete genomes suggest active adaptive evolution for their distinct niches. Comparative analysis of the spirochete genomes produced evidence of gene exchanges with other bacteria, archaea, and eukaryotic hosts that seem to have occurred at different points in the evolution of the spirochetes. Examples are presented of the use of sequence profile analysis to predict proteins that are likely to play a role in pathogenesis, including secreted proteins that contain specific protein-protein interaction domains, such as von Willebrand A, YWTD, TPR, and PR1, some of which hitherto have been reported only in eukaryotes. 
We tentatively reconstruct the likely evolutionary process that has led to the divergence of the two spirochete lineages; this reconstruction seems to point to an ancestral state resembling the symbiotic spirochetes found in insect guts.

Figures

FIG. 1
OC as a quantitative measure of orthology between different types of protein categories. (A) In the case of a one-to-one correspondence between genes in two genomes, $\mathrm{OC} = 2N_o/(N_1 + N_2)$, where $N_o$ is the number of orthologs and $N_1$ and $N_2$ are the numbers of members of the given protein family or functional category in the two compared genomes. (B) If there is a duplication (two or more members) in one or both of the species that occurred after divergence of the two species, then $\mathrm{OC} = (N_{o1} + N_{o2})/(N_1 + N_2)$, where $N_{o1}$ and $N_{o2}$ are the numbers of members in orthologous clusters from the two respective genomes.
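The two formulas in this legend can be illustrated with a short Python sketch. The function names and the example counts are hypothetical, chosen only for illustration; they are not taken from the paper or its software.

```python
def orthology_coefficient(n_orth_1, n_orth_2, n_1, n_2):
    """Case (B): OC = (No1 + No2) / (N1 + N2).

    n_orth_1, n_orth_2: members of orthologous clusters in genomes 1 and 2.
    n_1, n_2: total members of the family or category in genomes 1 and 2.
    """
    if n_1 < 0 or n_2 < 0 or n_1 + n_2 == 0:
        raise ValueError("family must have at least one member in total")
    if not (0 <= n_orth_1 <= n_1 and 0 <= n_orth_2 <= n_2):
        raise ValueError("ortholog counts must lie between 0 and family size")
    return (n_orth_1 + n_orth_2) / (n_1 + n_2)


def orthology_coefficient_one_to_one(n_orth, n_1, n_2):
    """Case (A): OC = 2*No / (N1 + N2), i.e. case (B) with No1 = No2 = No."""
    return orthology_coefficient(n_orth, n_orth, n_1, n_2)


# Hypothetical family: 4 members in each genome, 3 one-to-one ortholog pairs.
print(orthology_coefficient_one_to_one(3, 4, 4))  # 0.75

# Hypothetical family with a lineage-specific duplication: 3 members in
# genome 1 (2 of them in an orthologous cluster) and 2 members in genome 2
# (1 of them in that cluster).
print(orthology_coefficient(2, 1, 3, 2))  # 0.6
```

Treating case (A) as the special case of (B) in which both genomes contribute the same number of orthologs keeps a single code path for both formulas.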
FIG. 2
OCs for functional classes of proteins. The annotated predicted open reading frames from the proteomes of B. burgdorferi and T. pallidum were grouped by the predicted functional class of the protein. Orthologous proteins within each functional class were identified as described, and the OC values for each class were calculated as in Fig. 1. The vertical axis on the left shows the absolute number of proteins, and the one on the right shows OC values. OCs of >0.8 delineate highly conserved protein classes, OCs in the range of 0.6 to 0.8 are considered intermediate, and OCs of <0.6 represent significant divergence between the two genomes.
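The three OC ranges named in this legend can be written as a small Python helper. This is a sketch only: the function name and label strings are hypothetical, and assigning the boundary values 0.6 and 0.8 to the intermediate range follows the legend's wording (">0.8", "0.6 to 0.8", "<0.6").

```python
def classify_oc(oc):
    """Map an OC value to the conservation range described in the legend."""
    if not 0.0 <= oc <= 1.0:
        raise ValueError("OC must lie between 0 and 1")
    if oc > 0.8:
        return "highly conserved"
    if oc >= 0.6:
        return "intermediate"
    return "significantly divergent"


# The overall OC reported in the abstract for the two gene sets is about 0.43.
print(classify_oc(0.43))  # significantly divergent
```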
FIG. 3
Phylogenetic trees for representative examples of xenologous gene displacement and horizontal gene transfer in the spirochetes. The trees were constructed using alignments generated with CLUSTALW, followed by manual adjustments based on PSI-BLAST outputs. The neighbor-joining (NEIGHBOR program from Phylip) and least squares (from the PAUP package) methods were used to construct the trees, and 1,000 bootstrap replications were performed. The nodes supported by bootstrap values greater than 75% are indicated with the symbol “○.” All genes are shown with the mnemonic species name along with the GenBank accession number. Species abbreviations are as follows: Tp, T. pallidum; Bb, B. burgdorferi; Bs, B. subtilis; Ec, E. coli; Hi, H. influenzae; Pa, Pseudomonas aeruginosa; Pdn, Pseudomonas denitrificans; Rp, R. prowazekii; Zm, Zymomonas mobilis; Acom, Actinobacillus actinomycetemcomitans; Mtu, Mycobacterium tuberculosis; Mle, Mycobacterium leprae; Ssp, Synechocystis sp.; Ssc, Synechococcus sp.; Ana, Anabaena sp.; Nos, Nostoc sp.; Cpn, Chlamydia pneumoniae; Ct, Chlamydia trachomatis; Tm, T. maritima; Aae, Aquifex aeolicus; Ap, Aeropyrum pernix; Af, Archaeoglobus fulgidus; Mta, Methanobacterium thermoautotrophicum; Mj, Methanococcus jannaschii; Ph, Pyrococcus horikoshi; Um, Ustilago maydis; Poda, Podospora anserina; Sp, Schizosaccharomyces pombe; Sc, Saccharomyces cerevisiae; Ce, Caenorhabditis elegans; Hs, Homo sapiens; Dm, Drosophila melanogaster; Mm, Mus musculus; Oncvo, Onchocerca volvulus; At, Arabidopsis thaliana; Horvu, Hordeum vulgare; Soltu, Solanum tuberosum; Gi, Giardia intestinalis; Tbr, Trypanosoma brucei; En, Entamoeba histolytica.
FIG. 4
Multiple alignments of previously uncharacterized protein domains discovered in the course of comparative genome analysis of the spirochetes. (A) Resolvase-type HTH domains identified in large proteins containing an N-terminal signal peptide. (B) RNA polymerase-interacting domain found in CarD-like proteins and TRCF. (C) von Willebrand factor A domains in the spirochetes. (D) PR-1 domain in B. burgdorferi. The alignments were constructed using PSI-BLAST-derived pairwise alignments as input for the CLUSTALW program and were further adjusted based on the PSI-BLAST output. Proteins are designated with the protein name, species of origin, and GenBank accession number. Numbers preceding and following the aligned regions refer to the amino acid positions of the domain within each protein sequence. Coloring is according to the consensus shown below the alignment. Uppercase letters indicate conserved residues, by the single-letter amino acid code. Lowercase letters indicate conserved classes of amino acids that are highlighted using differential coloring or shading as follows: violet, polar residues (p) (C,D,E,H,K,N,Q,R,S,T); pink, charged residues (c) (D,E,H,K,R); green background, tiny residues (u) (A,G,S); green shading, small residues (s) (A,C,D,N,G,P,S,T,V); grey shading, bulky residues (b) (E,F,I,K,L,M,Q,R,W,Y); yellow shading, hydrophobic residues (h) (A,C,F,I,L,M,V,W,Y), aromatic residues (a) (F,H,W,Y), and aliphatic residues (l) (I,L,V). Species abbreviations are as follows: Hp, H. pylori; Mx, Myxococcus xanthus; Pf, Plasmodium falciparum. Others are as indicated in the legend to Fig. 3.
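The residue classes listed in this legend can be encoded as a lookup table. The Python sketch below reports which classes contain every residue of a single alignment column. It is an illustration only: the names `RESIDUE_CLASSES` and `shared_classes` are hypothetical, and requiring all residues in a column to belong to a class is an assumption, not the authors' consensus rule, which may have used a conservation threshold.

```python
# One-letter class codes and their member residues, as listed in the legend.
RESIDUE_CLASSES = {
    "p": set("CDEHKNQRST"),  # polar
    "c": set("DEHKR"),       # charged
    "u": set("AGS"),         # tiny
    "s": set("ACDNGPSTV"),   # small
    "b": set("EFIKLMQRWY"),  # bulky
    "h": set("ACFILMVWY"),   # hydrophobic
    "a": set("FHWY"),        # aromatic
    "l": set("ILV"),         # aliphatic
}


def shared_classes(column):
    """Return the class codes that contain every residue in the column.

    `column` is a string of one-letter amino acid codes, one per sequence.
    Gap characters belong to no class, so a gapped column yields an empty set.
    """
    if not column:
        raise ValueError("column must contain at least one residue")
    residues = set(column.upper())
    return {code for code, members in RESIDUE_CLASSES.items()
            if residues <= members}


print(sorted(shared_classes("ILV")))  # ['h', 'l']
print(sorted(shared_classes("DE")))   # ['c', 'p']
```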
FIG. 5
Differences and conservation in the domain architectures of proteins involved in signaling in B. burgdorferi and T. pallidum. Proteins unique to either organism are shown in the outer panels; the middle panel represents proteins conserved in the two organisms. Domain identification/organization is based on sequence searches using PSI-BLAST-derived PSSMs and Hidden Markov models derived using the HMMER2 package and was confirmed by detailed examination of sequence alignments. The figure is drawn roughly to scale. The double slash (//) indicates that a portion of the long sequence has been omitted, blue vertical bars represent transmembrane helices predicted using the TopPred II program, and the red horizontal bar represents a signal peptide predicted using the SignalP program. Domain abbreviations are as follows: MeTr, methyl transferase; MeAc, methyl acceptor; MeEst, methyl esterase; cheW, two-component signaling adaptor domain; Rec, receiver domain; AC, adenylyl cyclase; IIA/IIB/IIC, components of phosphoenolpyruvate:sugar phosphotransferase (PTS) system; PAS, Per/Arnt/Sim domain; HisK, histidine kinase; PhHis, phosphohistidine; HPT, histidine phosphotransferase; TIM, TIM barrel domain; DGC/P, diguanylate cyclase/phosphodiesterase domain; HAMP, domain common to histidine kinase, adenylyl cyclase, methyl acceptor, and PP2C phosphatase (11); HTH, helix-turn-helix; HD, HD superfamily phosphodiesterase domain; GAF, domain present in cGMP-specific phosphodiesterase (cyclic nucleotide binding); cAMP, cyclic AMP binding domain; TPR, tetratricopeptide repeat; PP2C, PP2C family phosphatase; ANK, ankyrin repeat; CBS, cystathionine-beta synthase domain (a widespread domain implicated in signaling [13]); HPT, histidine-containing phosphotransfer domain; ACT, aspartokinase, chorismate mutase, TyrA domain (8); TGS, threonyl-tRNA synthetase, GTPase, SpoT domain (78); HPRK, HPR kinase; HPr, phosphocarrier protein; PhHis, phosphohistidine; X, conserved domain of unknown function seen in the SpoT protein; fliY, domain common to CheX and FliY/M proteins.
FIG. 6
OCs for protein families and superfamilies identified in the two spirochetes. Protein and domain (super)families within the two spirochete proteomes were identified using transitive BLASTP searches and family-specific PSSMs, as described in Materials and Methods. The OC was computed for each of these (super)families as described in the text.

References

    1. Akbar S, Kang C M, Gaidenko T A, Price C W. Modulator protein RsbR regulates environmental signalling in the general stress pathway of Bacillus subtilis. Mol Microbiol. 1997;24:567–578.
    2. Altschul S F, Koonin E V. Iterated profile searches with PSI-BLAST—a tool for discovery in protein databases. Trends Biochem Sci. 1998;23:444–447.
    3. Altschul S F, Madden T L, Schaffer A A, Zhang J, Zhang Z, Miller W, Lipman D J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402.
    4. Amin U S, Lash T D, Wilkinson B J. Proline betaine is a highly effective osmoprotectant for Staphylococcus aureus. Arch Microbiol. 1995;163:138–142.
    5. Andersson S G, Zomorodipour A, Andersson J O, Sicheritz-Ponten T, Alsmark U C, Podowski R M, Naslund A K, Eriksson A S, Winkler H H, Kurland C G. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature. 1998;396:133–140.
