Comparative Study
Appl Environ Microbiol. 2013 Jul;79(13):4115-28.
doi: 10.1128/AEM.00817-13. Epub 2013 Apr 26.

Distribution of cp32 prophages among Lyme disease-causing spirochetes and natural diversity of their lipoprotein-encoding erp loci

Dustin Brisson et al. Appl Environ Microbiol. 2013 Jul.

Abstract

Lyme disease spirochetes possess complex genomes, consisting of a main chromosome and 20 or more smaller replicons. Among those small DNAs are the cp32 elements, a family of prophages that replicate as circular episomes. All complete cp32s contain an erp locus, which encodes surface-exposed proteins. Sequences were compared for all 193 erp alleles carried by 22 different strains of Lyme disease-causing spirochete to investigate their natural diversity and evolutionary histories. These included multiple isolates from a focus where Lyme disease is endemic in the northeastern United States and isolates from across North America and Europe. Bacteria were derived from diseased humans and from vector ticks and included members of 5 different Borrelia genospecies. All erp operon 5'-noncoding regions were found to be highly conserved, as were the initial 70 to 80 bp of all erp open reading frames, traits indicative of a common evolutionary origin. However, the majority of the protein-coding regions are highly diverse, due to numerous intra- and intergenic recombination events. Most erp alleles are chimeras derived from sequences of closely related and distantly related erp sequences and from unknown origins. Since known functions of Erp surface proteins involve interactions with various host tissue components, this diversity may reflect both their multiple functions and the abilities of Lyme disease-causing spirochetes to successfully infect a wide variety of vertebrate host species.

Figures

Fig 1
Neighbor-joining tree of the ParA-like proteins encoded on cp32 elements. The tree was constructed from all the proteins encoded by the parA-like genes of the cp32s of the sequenced B. burgdorferi sensu lato genomes and those known from strains CA15, Sh-2-82, and Far04. Numbers on the tree branches are bootstrap support values from 1,000 trials. Blue and yellow boxes highlight the different sequence types (tree branches) for the cp32 ParA proteins. These types correlate with episomes named in large red type. The species which harbored each cp32 are indicated by the color of the strain name at the right branch tips (see color key in upper left of figure). The DN127 protein labeled DN127frag in the cp32-12 branch is a ParA fragment from the DN127 cp32-Quad element. All the replicons in the figure carry highly related complements of non-partition cluster genes, except for seven “lp32” plasmids, noted by a pink background in the figure, which encode cp32-like ParA proteins but do not carry erp sequences (these will be described elsewhere [S. Casjens, unpublished data]). Replicon cp26 is an unrelated, universally present circular “plasmid” and is shown only for reference. Bar, 0.05 fractional change.
Fig 2
The 5′-noncoding and amino-terminal-coding sequences are conserved among erp operons. (A) Alignment of 5′-noncoding DNAs of 13 representative erp operons. Nucleotides occurring in >50% of all loci are boxed in blue. The ATG initiation codon of the proximal erp gene is on the right. On the far right are indications of whether each operon contains one or two erp open reading frames and the relative sizes of those sequences (S, small-sized erp; M, medium-sized erp; L, large-sized erp). The promoter −10 and −35 sites and the high-affinity binding sites of the regulatory factors BpaB, EbfC, and Bpur are indicated above the alignment (our unpublished results and references , , , and 82). (B) Alignment of 5′ ends of the erp alleles characterized below in Fig. 5A to C. Nucleotides occurring in >50% of all loci are boxed in blue. The green bar above the alignment indicates the codon for the lipid-modified cysteine of the mature lipoprotein. The leader polypeptide is encoded by nucleotides to the left of the cysteine codon, and localization/trafficking signals are encoded by the nucleotides to the right (85). The relative size of each gene is indicated to the left of the alignment.
Fig 3
Sequence similarities within and between erp sequence size classes are limited, except within the small-sized sequences. For each comparison, the horizontal line is the median, the box indicates the upper and lower quartiles, and whiskers indicate the maximum and minimum values. Average pairwise similarities among sequences within the large-sized sequences or within the medium-sized sequences are very low, with some pairs of sequences demonstrating as little as 17% identity. Small-sized sequences are a more homogeneous group, with an average of 69% sequence identity among pairs of sequences. Identity among pairs of sequences from different size groups was generally low, although some pairs of sequences between groups were more similar than pairs of sequences within groups. This result may indicate the presence of recombination events among sequences from different groups. Gap regions were not included in these analyses.
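The comparison summarized in this caption can be illustrated with a minimal sketch. It assumes the sequences are already aligned, that gaps are written as dashes, and that any column containing a gap in either sequence is skipped, which is one plausible reading of "gap regions were not included". The function names and the toy alleles are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from statistics import median, quantiles


def pairwise_identity(a: str, b: str, gap: str = "-") -> float:
    """Fraction of identical positions between two aligned sequences.

    Columns where either sequence has a gap are skipped entirely.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    return sum(x == y for x, y in compared) / len(compared)


def box_summary(values):
    """Median, quartiles, and extremes, as drawn in a box-and-whisker plot."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "max": max(values),
    }


# Toy alignment (hypothetical sequences, not real erp alleles).
toy = {
    "allele_1": "ATGAAC-AAA",
    "allele_2": "ATGAACGAAA",
    "allele_3": "ATGTTC-TTA",
}
identities = [pairwise_identity(toy[p], toy[q]) for p, q in combinations(toy, 2)]
summary = box_summary(identities)
```

Running the same function over pairs drawn from within one size class, and then over pairs drawn from two different classes, would give the within-group and between-group distributions that the figure compares.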
Fig 4
Sequence diversity is generated by intragenic recombination. Evidence of intragenic recombination was apparent in the majority of analyzed alleles from small-, medium-, and large-sized erp sequences (A, B, and C, respectively). The locations of recombination events are visualized by the changes in colors along the line representing the erp sequence; the identity of the donor erp allele is denoted above the colored line. For example, erp sequences 118a_AB39, 72a_N39, N40_erp26, and DN127_R39 (C, top) all contain a segment homologous to the erp sequence 94a_M41 near the 3′ end of the protein. Despite the diversity generated by recombination, identical alleles are present in multiple strains in all three size classes of erp sequences. All described recombination events were supported by at least three independent algorithms (see Table S1 in the supplemental material). Recombination events between alleles of different size classes could not be detected by the methods used here. The sequences of erp alleles carried by strains BL206 and Sh-2-82 are identical to sequences maintained by strains B31 and 297, respectively, and were omitted from these analyses since they did not provide unique information on genetic diversity (this work and reference 46). Asterisks at tree branches indicate bootstrap support values from 1,000 trials (***, >950; **, >850; *, >750; unmarked, ≤750).
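The asterisk key at the end of this caption amounts to a small threshold lookup. The sketch below encodes it, assuming support is given as a count out of 1,000 bootstrap trials; the function name is hypothetical.

```python
def support_marker(bootstrap_count: int) -> str:
    """Map a bootstrap count (out of 1,000 trials) to the caption's asterisk code."""
    if not 0 <= bootstrap_count <= 1000:
        raise ValueError("bootstrap count must be between 0 and 1000")
    if bootstrap_count > 950:
        return "***"
    if bootstrap_count > 850:
        return "**"
    if bootstrap_count > 750:
        return "*"
    return ""  # unmarked branches: support of 750 or less


markers = {count: support_marker(count) for count in (1000, 950, 851, 750)}
```

The comparisons are strict, so a branch supported by exactly 950 of the trials receives two asterisks rather than three.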
Fig 5
Alignments of representative erp sequences, illustrating sequence similarities and diversities within each size group. For all panels, bases that share >50% identity are boxed in blue and spaces introduced to maximize alignment are indicated by dashes. (A) Small-sized erp sequences; (B) medium-sized erp sequences; (C) large-sized erp sequences.
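The ">50%" boxing rule used in these alignments can be sketched as a per-column majority check. This is an illustration only: it assumes equal-length aligned rows, treats dashes as gaps that can never be the boxed base, and uses the total number of rows as the denominator. The function name and the toy alignment are hypothetical.

```python
from collections import Counter


def majority_columns(alignment, threshold=0.5, gap="-"):
    """For each column, return the base present in more than `threshold`
    of all sequences, or None when no base reaches that share."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for column in zip(*alignment):
        counts = Counter(base for base in column if base != gap)
        if counts:
            base, count = counts.most_common(1)[0]
            if count / len(column) > threshold:
                result.append(base)
                continue
        result.append(None)
    return result


# Toy alignment (hypothetical sequences).
toy_alignment = ["ATG-A", "ATGCA", "TTGCC", "ACACC"]
boxed = majority_columns(toy_alignment)
```

In the toy alignment the last column is split evenly between A and C, so no base exceeds 50% and that column is left unboxed.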
Fig 6
Strains 118a, 72a, CA-11.2A, and 94a each contain an identical erp gene that is a chimera of a small- and a medium-sized erp gene. Illustrated are a schematic and a sequence alignment comparing the strain 118a Q40 gene with the strain B31 erpA gene, which is a similar small-sized allele, and the strain 118a R38 gene, which is a similar medium-sized allele. All three open reading frames possess the characteristically conserved erp 5′ sequence, indicated in blue. Sequences of 118a Q40 that are similar to those of erpA or R38 are indicated by green or orange, respectively. In the sequence alignment, sequences of erpA and R38 that are identical to those of Q40 are indicated by asterisks. The approximate region of the recombination event that combined a small- and medium-sized erp to produce the chimera is bordered by red arrowheads.

References

    1. Fraser CM, Casjens S, Huang WM, Sutton GG, Clayton R, Lathigra R, White O, Ketchum KA, Dodson R, Hickey EK, Gwinn M, Dougherty B, Tomb J-F, Fleischmann RD, Richardson D, Peterson J, Kerlavage AR, Quackenbush J, Salzberg S, Hanson M, van Vugt R, Palmer N, Adams MD, Gocayne J, Weidmann J, Utterback T, Watthey L, McDonald L, Artiach P, Bowman C, Garland S, Fujii C, Cotton MD, Horst K, Roberts K, Hatch B, Smith HO, Venter JC. 1997. Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 390:580–586
    2. Casjens S, van Vugt R, Tilly K, Rosa PA, Stevenson B. 1997. Homology throughout the multiple 32-kilobase circular plasmids present in Lyme disease spirochetes. J. Bacteriol. 179:217–227
    3. Casjens S, Palmer N, van Vugt R, Huang WM, Stevenson B, Rosa P, Lathigra R, Sutton G, Peterson J, Dodson RJ, Haft D, Hickey E, Gwinn M, White O, Fraser C. 2000. A bacterial genome in flux: the twelve linear and nine circular extrachromosomal DNAs of an infectious isolate of the Lyme disease spirochete Borrelia burgdorferi. Mol. Microbiol. 35:490–516
    4. Miller JC, Bono JL, Babb K, El-Hage N, Casjens S, Stevenson B. 2000. A second allele of eppA in Borrelia burgdorferi strain B31 is located on the previously undetected circular plasmid cp9-2. J. Bacteriol. 182:6254–6258
    5. Casjens SR, Eggers CH, Schwartz I. 2010. Borrelia genomics: chromosome, plasmids, bacteriophages and genetic variation, p 27–54 In Samuels DS, Radolf JD. (ed), Borrelia: molecular biology, host interaction and pathogenesis. Caister Academic Press, Norfolk, United Kingdom