Comparative Study
BMC Evol Biol. 2006 Dec 12:6:107.
doi: 10.1186/1471-2148-6-107.

Insights into the evolutionary history of tubercle bacilli as disclosed by genetic rearrangements within a PE_PGRS duplicated gene pair

Anis Karboul et al. BMC Evol Biol.

Abstract

Background: The highly homologous PE_PGRS (Proline-glutamic acid_polymorphic GC-rich repetitive sequence) genes are members of the PE multigene family which is found only in mycobacteria. PE genes are particularly abundant within the genomes of pathogenic mycobacteria where they seem to have expanded as a result of gene duplication events. PE_PGRS genes are characterized by their high GC content and extensive repetitive sequences, making them prone to recombination events and genetic variability.

Results: Comparative sequence analysis of Mycobacterium tuberculosis genes PE_PGRS17 (Rv0978c) and PE_PGRS18 (Rv0980c) revealed a striking genetic variation associated with this typical tandem duplicate. In comparison to the M. tuberculosis reference strain H37Rv, the variation (named the 12/40 polymorphism) consists of an in-frame 12-bp insertion invariably accompanied by a set of 40 single nucleotide polymorphisms (SNPs) that occurs either in PE_PGRS17 or in both genes. Sequence analysis of the paralogous genes in a representative set of worldwide distributed tubercle bacilli isolates revealed data which supported previously proposed evolutionary scenarios for the M. tuberculosis complex (MTBC) and confirmed the very ancient origin of "M. canettii" and other smooth tubercle bacilli. Strikingly, the identified polymorphism appears to be coincident with the emergence of the post-bottleneck successful clone from which the MTBC expanded. Furthermore, the findings provide direct and clear evidence for the natural occurrence of gene conversion in mycobacteria, which appears to be restricted to modern M. tuberculosis strains.

Conclusion: This study provides a new perspective to explore the molecular events that accompanied the evolution, clonal expansion, and recent diversification of tubercle bacilli.

Figures

Figure 1
Pairwise amino acid similarity of contiguous PE genes of the M. tuberculosis strain H37Rv. The percent amino acid similarity values were calculated using the BioEdit program [52].
Figure 2
Genomic context and homology of the PE_PGRS17 and PE_PGRS18 genes. (A) Genetic map showing the genomic context of the PE_PGRS17 (Rv0978c) and PE_PGRS18 (Rv0980c) genes within the genome of M. tuberculosis H37Rv. (B) Schematic representation of the homology shared between the PE_PGRS17 and PE_PGRS18 nucleotide sequences. Note the sequence relatedness with PE_PGRS45 (Rv2615c). MAR: major alignable coding region.
Figure 3
Distribution of the 12/40 polymorphism. (A) Schematic representation of PE_PGRS17 and PE_PGRS18 showing the distribution of the 12/40 polymorphism (red block) between paralogous and orthologous sequences. (B) Multiple sequence alignment of the 12/40 polymorphism region (highlighted in grey) in PE_PGRS17 and PE_PGRS18 of the three sequenced mycobacterial genomes: M. tuberculosis H37Rv (H37Rv), M. tuberculosis CDC1551 (CDC1551) and M. bovis AF2122/97 (M. bov AF2122/97). The corresponding sequences in M. tuberculosis strain H37Ra were determined by sequencing and proved to be identical to those of H37Rv. MAR: major alignable coding region.
Figure 4
Plot showing the genetic variability within PE_PGRS17 and PE_PGRS18. The plot shows the distribution of the 12/40 polymorphism and the other SNPs among 101 tubercle bacilli isolates distributed worldwide (98 tubercle bacilli sequenced in this study along with the sequences from M. tuberculosis reference strains H37Rv and CDC1551 and M. bovis reference strain AF2122/97). SNPs (relative to the M. tuberculosis reference strain H37Rv) are shown in colours other than orange, whereas a grey background indicates the presence of a deletion. M. tb: M. tuberculosis; M. afri: M. africanum; M. cap: M. caprae; M. pin: M. pinnipedii; STB: smooth tubercle bacilli.
Figure 5
The PE grouping assay (PEGAssay) pattern. Representation of the hybridization pattern results obtained using PEGAssay on selected members of the M. tuberculosis complex and other mycobacteria (M. canettii and M. smegmatis). Lanes 17 and 18 correspond to PE_PGRS17 and PE_PGRS18, respectively. MTBC strains producing a single reacting dot with PE_PGRS17 (presence of the 12/40 polymorphism in PE_PGRS17 but not in PE_PGRS18) are referred to as "+/-" and belong to the PGRS type 1 genotypic group (PGRST1). Isolates whose PEGAssay pattern produces two reacting dots (presence of the 12/40 polymorphism in both PE_PGRS17 and PE_PGRS18) are referred to as "+/+" and belong to the PGRS type 2 genotypic group (PGRST2). Finally, isolates whose PEGAssay produces no hybridization signal (absence of the 12/40 polymorphism from both PE_PGRS genes) are referred to as "-/-" and belong to the PGRS type 3 genotypic group (PGRST3). Note that M. smegmatis strain mc²155, which lacks both the PE_PGRS17 and PE_PGRS18 genes entirely, produces no hybridization signal.
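The grouping rule in the Figure 5 caption can be written as a small lookup. The following is only an illustrative sketch: the function name `pgrs_type` and its boolean inputs are hypothetical and are not part of the published assay. A "-/+" pattern is not described in the caption, so the sketch returns `None` for it. As the caption notes for M. smegmatis, an absent signal can also mean that the genes are missing altogether, which this sketch does not distinguish.

```python
from typing import Optional


def pgrs_type(signal_pgrs17: bool, signal_pgrs18: bool) -> Optional[str]:
    """Map PEGAssay hybridization signals to a PGRS type (hypothetical helper).

    Each argument is True when the probe for that gene gives a reacting dot,
    i.e. the 12/40 polymorphism is present in that gene.
    """
    if signal_pgrs17 and signal_pgrs18:
        return "PGRST2"  # "+/+": polymorphism in both genes
    if signal_pgrs17 and not signal_pgrs18:
        return "PGRST1"  # "+/-": polymorphism in PE_PGRS17 only
    if not signal_pgrs17 and not signal_pgrs18:
        return "PGRST3"  # "-/-": no hybridization signal
    return None  # "-/+": not covered by the caption


if __name__ == "__main__":
    for pattern in [(True, False), (True, True), (False, False)]:
        print(pattern, pgrs_type(*pattern))
```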
Figure 6
Distribution of the three PE_PGRS-based PGRST groups within the three Principal Genetic Groups (PGGs). (A) Distribution throughout the whole collection (521 strains): (1) ancestral M. tuberculosis, (2) modern M. tuberculosis, (3) M. bovis + subspecies, (4) M. africanum, (5) M. microti + subspecies and (6) "M. canettii". (B) Distribution in the Tunisian M. tuberculosis collection. (C) Distribution in the American M. tuberculosis collection. (D) Distribution in the South African M. tuberculosis collection. Blue bars: PGRST1 (+/-); yellow bars: PGRST2 (+/+); orange bars: PGRST3 (-/-).
Figure 7
Genetic variability of the PE_PGRS17 and PE_PGRS18 genes throughout a worldwide collection of tubercle bacilli isolates. The sequenced region encompasses nucleotides 31 to 712 for both genes (numbering according to the reference strain H37Rv). Sequences of both genes were concatenated and, for MTBC strains, each unique sequence was assigned a type number (T1 to T26). Yellow boxes indicate that the sequence at this site is identical to that of the reference strain M. tuberculosis H37Rv. Blue and orange backgrounds correspond to genetic variations (in comparison to H37Rv) within the PE_PGRS17 and PE_PGRS18 sequenced regions, respectively. White background indicates that no sequence is available. STB: smooth tubercle bacillus. I1: a trinucleotide (GCC) insertion immediately after position 90. I2: a 1-nucleotide (A) frameshift insertion immediately after position 270. Δ1: a 114-nucleotide in-frame deletion from positions 512 to 625. Δ2: a 9-nucleotide in-frame deletion from positions 631 to 639.
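The typing step described in the Figure 7 caption (concatenate the two gene sequences, then give each unique concatenated sequence a type number) might be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the function name, the first-seen numbering order, the `|` separator used to keep the gene boundary unambiguous, and the toy sequences are all hypothetical.

```python
from typing import Dict, Tuple


def assign_sequence_types(isolates: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
    """Assign a type label (T1, T2, ...) to each unique concatenated sequence.

    `isolates` maps an isolate name to its (PE_PGRS17, PE_PGRS18) sequences.
    Types are numbered in order of first appearance, which is an assumption.
    """
    type_by_sequence: Dict[str, str] = {}
    type_by_isolate: Dict[str, str] = {}
    for name, (seq17, seq18) in isolates.items():
        # The separator avoids confusing e.g. ("AC", "G") with ("A", "CG").
        concatenated = seq17.upper() + "|" + seq18.upper()
        if concatenated not in type_by_sequence:
            type_by_sequence[concatenated] = f"T{len(type_by_sequence) + 1}"
        type_by_isolate[name] = type_by_sequence[concatenated]
    return type_by_isolate


if __name__ == "__main__":
    # Toy data only; these are not sequences from the study.
    toy = {
        "isolate_a": ("ACGT", "TTAG"),
        "isolate_b": ("ACGT", "TTAG"),
        "isolate_c": ("ACGT", "TTGG"),
    }
    print(assign_sequence_types(toy))
```

Isolates that share both sequences receive the same label, so the number of distinct labels equals the number of unique concatenated sequences in the input.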
Figure 8
Schematic representation of the proposed evolutionary history of the tubercle bacilli according to the presence or absence of the 12/40 polymorphism. The scenario was constructed from the distribution of the 12/40 polymorphism within and between species. This scenario is consistent with previously proposed evolutionary schemes based on deletion regions and single nucleotide polymorphisms [15, 15, 19].
Figure 9
Distribution of the three PGRS types through sSNP-based genetic clusters and PGG groups. The phylogenetic tree of M. tuberculosis isolates from New York and New Jersey shows the relative distribution of PGRS types 1 (PGRST1; +/-), 2 (PGRST2; +/+), and 3 (PGRST3; -/-) with respect to the 9 major genetic clusters (I, II, IIA, and III to VIII) defined by synonymous single-nucleotide polymorphisms (sSNPs) and to PGG grouping. The phylogenetic tree, based on 36 sSNPs, was constructed as described by Gutacker et al. [33]. The data clearly indicate that the PE_PGRS17/PE_PGRS18-associated gene conversion event occurred multiple times, mainly in PGG2 strains.

References

    1. World Health Organization. Global Tuberculosis Control Surveillance, Planning, Financing. World Health Organization, Geneva; 2005.
    2. Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, Gordon SV, Eiglmeier K, Gas S, Barry CE, 3rd, et al. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature. 1998;393:537–544. doi: 10.1038/31159.
    3. Garnier T, Eiglmeier K, Camus JC, Medina N, Mansoor H, Pryor M, Duthoy S, Grondin S, Lacroix C, Monsempe C, et al. The complete genome sequence of Mycobacterium bovis. Proc Natl Acad Sci USA. 2003;100:7877–7882. doi: 10.1073/pnas.1130426100.
    4. Sreevatsan S, Pan X, Stockbauer KE, Connell ND, Kreiswirth BN, Whittam TS, Musser JM. Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proc Natl Acad Sci USA. 1997;94:9869–9874. doi: 10.1073/pnas.94.18.9869.
    5. Musser JM, Amin A, Ramaswamy S. Negligible genetic diversity of Mycobacterium tuberculosis host immune system protein targets: evidence of limited selective pressure. Genetics. 2000;155:7–16.
