Comparative Study
J Bacteriol. 2012 Jun;194(12):3199-215.
doi: 10.1128/JB.00183-12. Epub 2012 Apr 13.

Pangenomic study of Corynebacterium diphtheriae that provides insights into the genomic diversity of pathogenic isolates from cases of classical diphtheria, endocarditis, and pneumonia


Eva Trost et al. J Bacteriol. 2012 Jun.

Abstract

Corynebacterium diphtheriae is one of the most prominent human pathogens and the causative agent of the communicable disease diphtheria. The genomes of 12 strains isolated from patients with classical diphtheria, endocarditis, and pneumonia were completely sequenced and annotated. Including the genome of C. diphtheriae NCTC 13129, we present a comprehensive comparative analysis of 13 strains and the first characterization of the pangenome of the species C. diphtheriae. Comparative genomics showed extensive synteny and revealed a core genome consisting of 1,632 conserved genes. The pangenome currently comprises 4,786 protein-coding regions and increases at an average of 65 unique genes per newly sequenced strain. Analysis of prophages carrying the diphtheria toxin gene tox revealed that the toxoid vaccine producer C. diphtheriae Park-Williams no. 8 has been lysogenized by two copies of the ω tox+ phage, whereas C. diphtheriae 31A harbors a hitherto-unknown tox+ corynephage. DNA binding sites of the tox-controlling regulator DtxR were detected by genome-wide motif searches. Comparative content analysis showed that the DtxR regulons exhibit marked differences due to gene gain, gene loss, partial gene deletion, and DtxR binding site depletion. Most predicted pathogenicity islands of C. diphtheriae showed characteristics of horizontal gene transfer. The majority of these islands encode subunits of adhesive pili, which can play important roles in adhesion of C. diphtheriae to different host tissues. All sequenced isolates contain at least two pilus gene clusters. It appears that variation in the distributed genome is a common strategy of C. diphtheriae to establish differences in host-pathogen interactions.
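
The terms core genome, pangenome, and unique genes used in the abstract can be illustrated with plain set operations. The following is a minimal sketch, not the pipeline used in the study: the strain names and gene identifiers are invented, and it assumes that orthologous genes have already been mapped to a shared identifier (orthology assignment itself is not shown here).

```python
from collections import Counter

# Toy gene-content sets; strain names and gene IDs are hypothetical.
genomes = {
    "strainA": {"g1", "g2", "g3", "g4"},
    "strainB": {"g1", "g2", "g3", "g5"},
    "strainC": {"g1", "g2", "g6"},
}

def core_genome(gene_sets):
    """Genes present in every genome."""
    return set.intersection(*gene_sets.values())

def pangenome(gene_sets):
    """Genes present in at least one genome."""
    return set.union(*gene_sets.values())

def singletons(gene_sets):
    """Genes present in exactly one genome."""
    counts = Counter(g for genes in gene_sets.values() for g in genes)
    return {g for g, n in counts.items() if n == 1}

print(sorted(core_genome(genomes)))   # genes shared by all three toy strains
print(len(pangenome(genomes)))        # total number of distinct genes
print(sorted(singletons(genomes)))    # strain-specific genes
```

In this toy data set the core genome holds two genes, the pangenome six, and each strain contributes one singleton.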

Figures

Fig 1
Phylogenetic trees of hitherto-sequenced C. diphtheriae strains based on allelic profiles of housekeeping genes (A) and variations in the deduced core genome (B). (A) Allelic profiles of the housekeeping genes were determined according to references deposited in the C. diphtheriae MLST database mlstdbNet. The dendrogram was calculated with the PHYLIP package using the unweighted-pair group method with arithmetic mean (UPGMA). (B) The core genome of C. diphtheriae was deduced with EDGAR software and includes 1,632 genes. The dendrogram was calculated with the EDGAR system using the neighbor-joining method.
Fig 2
Pairwise comparison of the gene contents of hitherto-sequenced C. diphtheriae strains. Similarities denote the number of genes shared by a particular pair of strains, differences display the number of genes not shared within a pair of strains, and pair-uniques correspond to orthologous genes shared only by a distinct strain pair. All calculations were carried out with the software tool EDGAR. The highest and lowest values of each category are listed and specifically marked.
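
The three categories defined in this caption map onto simple set operations. The sketch below is an illustration under stated assumptions, not EDGAR's implementation: the strain names and gene IDs are invented, orthologs are assumed to share an identifier, and "differences" is read as the symmetric difference of the two gene sets.

```python
from itertools import combinations

# Toy gene-content sets; strain names and gene IDs are hypothetical.
genomes = {
    "strainA": {"g1", "g2", "g3", "g4"},
    "strainB": {"g1", "g2", "g3", "g5"},
    "strainC": {"g1", "g2", "g6"},
}

def pairwise_stats(gene_sets):
    """Return similarities, differences, and pair-uniques for each strain pair."""
    stats = {}
    for a, b in combinations(sorted(gene_sets), 2):
        shared = gene_sets[a] & gene_sets[b]
        # Union of all genes found in strains other than a and b.
        others = set().union(
            *(genes for name, genes in gene_sets.items() if name not in (a, b))
        )
        stats[(a, b)] = {
            "similarities": len(shared),                    # genes in both strains
            "differences": len(gene_sets[a] ^ gene_sets[b]),  # genes in only one of the two
            "pair_uniques": len(shared - others),           # shared by this pair and no other strain
        }
    return stats

for pair, values in pairwise_stats(genomes).items():
    print(pair, values)
```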
Fig 3
Development of the number of core genes and singletons as a function of the number of sequenced C. diphtheriae strains. The respective numbers were calculated for two strains and then iteratively for an increasing number of sequenced genomes, added one by one. The deduced equations denote the exponential decay model based on the median number of core genes and singletons, when increasing numbers of genomes were compared.
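
The iterative procedure described in this caption (start with two genomes, add the rest one by one, summarize by the median) can be sketched as follows. This is a hedged illustration only: the toy data, the number of random orderings, and the definition of a singleton as a gene found in exactly one of the genomes compared so far are all assumptions, and the exponential decay fit itself is not reproduced.

```python
import random
from collections import Counter
from statistics import median

# Toy gene-content sets; strain names and gene IDs are hypothetical.
genomes = {
    "strainA": {"g1", "g2", "g3", "g4"},
    "strainB": {"g1", "g2", "g3", "g5"},
    "strainC": {"g1", "g2", "g6"},
}

def development(gene_sets, n_perm=200, seed=0):
    """Median core and singleton counts for k = 2..N genomes over random orderings."""
    rng = random.Random(seed)
    names = list(gene_sets)
    sizes = range(2, len(names) + 1)
    core_counts = {k: [] for k in sizes}
    single_counts = {k: [] for k in sizes}
    for _ in range(n_perm):
        order = names[:]
        rng.shuffle(order)  # one random order of genome addition
        for k in sizes:
            subset = [gene_sets[name] for name in order[:k]]
            core = set.intersection(*subset)
            counts = Counter(g for genes in subset for g in genes)
            core_counts[k].append(len(core))
            single_counts[k].append(sum(1 for n in counts.values() if n == 1))
    core_median = {k: median(v) for k, v in core_counts.items()}
    single_median = {k: median(v) for k, v in single_counts.items()}
    return core_median, single_median

core_median, single_median = development(genomes)
print(core_median)
print(single_median)
```

The core count can only shrink or stay constant as genomes are added, which is why a decay model is a natural choice for summarizing it.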
Fig 4
Whole-genome alignment of the 13 sequenced C. diphtheriae strains. The nucleotide sequence alignment was calculated with the software Mauve using the genome of C. diphtheriae C7(β)tox+ as a reference. Each genome is presented in linear view, and homologous DNA segments are shown as colored blocks. The position of the origin of replication (oriC) is indicated. Identified transposase genes are marked by black lines. DNA segments mentioned in the text are labeled as follows: ϕ, additional prophage in C7(β)tox+; ω+P, ω prophages and adjacent pilus gene cluster in PW8; tox, tox+ prophage in 31A; AB, antibiotic resistance gene region in BH8.
Fig 5
Heaps' law plot representing the development of the pangenome of C. diphtheriae. The total number of genes found according to the pangenome analysis is shown for increasing numbers of sequenced C. diphtheriae genomes. Medians of the distributions are shown by squares; permutations are indicated.
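
Heaps' law is a power law of the form n = κ·N^γ, where n is the number of distinct genes found and N the number of genomes sampled. One common way to estimate the two parameters is ordinary least squares on the log-log scale, sketched below. The fitting method is an assumption (the caption does not state how the curve was fitted), and the input values are synthetic numbers generated from arbitrary parameters, not results from the study.

```python
import math

def fit_heaps(n_genomes, pan_sizes):
    """Fit pan_size = kappa * n_genomes**gamma by linear regression on logs."""
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(p) for p in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                            # slope on the log-log scale
    kappa = math.exp(mean_y - gamma * mean_x)    # intercept, transformed back
    return kappa, gamma

# Synthetic demonstration data: kappa=100 and gamma=0.5 are arbitrary choices.
ns = list(range(1, 14))
sizes = [100.0 * n ** 0.5 for n in ns]

kappa, gamma = fit_heaps(ns, sizes)
print(kappa, gamma)
```

In practice the input would be the median pangenome size for each N taken over permutations of the genome order, as the caption describes.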
Fig 6
Genome alignment of tox+ prophages identified in the sequenced C. diphtheriae strains. The nucleotide sequence alignment was calculated with the software Mauve. The height of the plot denotes the similarities of the aligned DNA sequences. The tox gene is located at the right-hand end of the prophage genome. The proposed modular structure of the corynephage present in C. diphtheriae NCTC 13129 is indicated by annotated brackets.
Fig 7
Comparison of predicted DtxR regulons encoded in the sequenced C. diphtheriae genomes. Genes and gene clusters specified by the presence of DtxR binding sites are listed with their proposed physiological functions. The presence of DtxR binding sites is represented by gray boxes. White boxes denote gene clusters and corresponding DtxR binding sites missing in the respective genomes. Specifically marked are the duplication of the tox gene in PW8 (2 × tox), the deletion in the sidBA gene region (ΔsidBA), and the integration of an insertion sequence into the regulatory region of the nitrate reductase gene cluster (IS). Genes assigned to the same cluster are linked with hyphens. The position of the DtxR binding site is marked by double slashes if it is located between two divergently oriented gene clusters. The asterisks label gene clusters with experimental information on DtxR regulation.
Fig 8
Genomic islands detected in the sequenced C. diphtheriae genomes and comparison of the predicted gene contents. The genomic islands were identified with the software PIPS, and the deduced similarities are shown as percentages.
Fig 9
Overview of pilus gene clusters found in the sequenced C. diphtheriae strains in relation to the reference strain C. diphtheriae NCTC 13129. Homologous pilin genes are indicated by color; sortase genes are shown in dark gray. Genes encoding hypothetical proteins in the pilus gene cluster of C. diphtheriae PW8 are shown in light gray; mobile elements are labeled in yellow. Genes similar to SpaD and SpaH types are denoted with primes. Asterisks and hatched arrows indicate fragmented genes.
Fig 10
Analysis of pilus shaft proteins (A) and the corresponding sortases (B). ClustalW2 was used to align the protein sequences for major pilin subunits and the predicted sortases of 13 C. diphtheriae strains. Their phylogenetic trees were reconstructed with the neighbor-joining algorithm using MEGA 4.0 software. Locus tags are color-coded to indicate the pilus types of the reference strain C. diphtheriae NCTC 13129. Proteins similar to SpaH, SrtD, and SrtE are denoted with primes.
Fig 11
Phylogenetic trees of the tip (A) and base (B) pilus proteins. ClustalW2 was used to align the protein sequences for tip and base pilins of 13 C. diphtheriae strains. Phylogenetic trees were reconstructed with the neighbor-joining algorithm using MEGA 4.0 software. Locus tags are color-coded to indicate the pilus types of the reference strain C. diphtheriae NCTC 13129. Proteins similar to SpaG and SpaI are denoted with primes.
