PLoS Pathog. 2020 Dec 28;16(12):e1009181.

doi: 10.1371/journal.ppat.1009181. eCollection 2020 Dec.

Phylogenomics of 8,839 Clostridioides difficile genomes reveals recombination-driven evolution and diversification of toxin A and B

Michael J Mansfield¹,², Benjamin J-M Tremblay¹, Ji Zeng³,⁴, Xin Wei¹, Harold Hodgins¹, Jay Worley⁵,⁶, Lynn Bry⁵,⁷, Min Dong³,⁴, Andrew C Doxey¹

Affiliations

¹ Department of Biology, David R. Cheriton School of Computer Science, and Waterloo Centre for Microbial Research, University of Waterloo, Waterloo, Ontario, Canada.
² Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, Japan.
³ Department of Urology, Boston Children's Hospital, Boston, Massachusetts, United States of America.
⁴ Department of Microbiology, Harvard Medical School, Boston, Massachusetts, United States of America.
⁵ Massachusetts Host-Microbiome Center, Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America.
⁶ National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America.
⁷ Division of Infectious Diseases, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America.

PMID: 33370413
PMCID: PMC7853461
DOI: 10.1371/journal.ppat.1009181

Michael J Mansfield et al. PLoS Pathog. 2020.

Abstract

Clostridioides difficile is the major worldwide cause of antibiotic-associated gastrointestinal infection. A pathogenicity locus (PaLoc) encoding one or two homologous toxins, toxin A (TcdA) and toxin B (TcdB), is essential for C. difficile pathogenicity. However, toxin sequence variation poses major challenges for the development of diagnostic assays, therapeutics, and vaccines. Here, we present a comprehensive phylogenomic analysis of 8,839 C. difficile strains and their toxins, including 6,492 genomes that we assembled from the NCBI short read archive. A total of 5,175 tcdA and 8,022 tcdB genes clustered into 7 (A1-A7) and 12 (B1-B12) distinct subtypes, which form the basis of a new method for toxin-based subtyping of C. difficile. We developed a haplotype coloring algorithm to visualize amino acid variation across all toxin sequences, which revealed that TcdB has diversified through extensive homologous recombination throughout its entire sequence and has formed new subtypes through distinct recombination events. In contrast, TcdA varies mainly in the number of repeats in its C-terminal repetitive region, suggesting that recombination-mediated diversification of TcdB provides a selective advantage in C. difficile evolution. We then validated toxin subtyping by classifying 351 C. difficile clinical isolates from Brigham and Women's Hospital in Boston, demonstrating its clinical utility. Subtyping partitions TcdB into binary functional and antigenic groups generated by intragenic recombination: two distinct cell-rounding phenotypes, whether the toxin recognizes frizzled proteins as receptors, and whether it can be efficiently neutralized by the monoclonal antibody bezlotoxumab, the only FDA-approved therapeutic antibody. Our analysis also identifies eight universally conserved surface patches across the TcdB structure, representing ideal targets for developing broad-spectrum therapeutics. Finally, we established an open online database (DiffBase) as a central hub for collection and classification of C. difficile toxins, which will help clinicians decide on therapeutic strategies targeting specific toxin variants, and allow researchers to monitor the ongoing evolution and diversification of C. difficile.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Clustering of TcdA and TcdB sequences derived from NCBI GenBank and SRA into subtypes.**
(A) Hierarchical clustering of TcdA sequences, split into 8 groups. (B) Neighbor-joining phylogenetic tree of representative sequences of each TcdA subtype. (C) Percentage identities between representative sequences. (D) Hierarchical clustering of TcdB sequences, split into 14 groups. (E) Neighbor-joining phylogenetic tree of representative sequences of each TcdB subtype. (F) Percentage identities between representative sequences. Hierarchical clustering was performed using the hclust() function in R, and cluster definitions were selected based on strong within-cluster sequence similarities and weak between-cluster similarities, as demonstrated visually and quantitatively. The reference strains (VPI 10463 and strain 630) are associated with TcdA group A1 and TcdB group B1. The hypervirulent ribotype 027 strains such as R12087 and R20291 are associated with TcdA group A2 and TcdB group B2. Also included are the homologs of TcdA and TcdB (TcsH and TcsL, respectively) from *P. sordellii*, which, as expected, exhibit the highest divergence from other groups. The datasets include TcdA and TcdB sequences from NCBI GenBank as well as additional sequences assembled from the SRA.
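The clustering idea in this caption can be illustrated with a minimal sketch. It is a pure-Python stand-in for R's hclust(), not the authors' pipeline: the average-linkage rule, the fixed `min_identity` cutoff, the function names, and the toy sequences are all assumptions made for illustration. The caption says only that cluster definitions were chosen from within-cluster and between-cluster similarity.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def average_linkage_clusters(seqs: dict, min_identity: float) -> list:
    """Greedily merge the two most similar clusters until the best pair
    falls below min_identity (an assumed, illustrative cutoff)."""
    ident = {
        frozenset(pair): percent_identity(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(seqs, 2)
    }
    clusters = [{name} for name in seqs]

    def mean_identity(c1: set, c2: set) -> float:
        vals = [ident[frozenset((a, b))] for a in c1 for b in c2]
        return sum(vals) / len(vals)

    while len(clusters) > 1:
        # combinations() yields i < j, so deleting index j keeps index i valid.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: mean_identity(clusters[ij[0]], clusters[ij[1]]),
        )
        if mean_identity(clusters[i], clusters[j]) < min_identity:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Toy aligned sequences (invented, not real toxin data).
toy = {
    "s1": "MSLVNRKQLE",
    "s2": "MSLVNRKQLD",
    "s3": "MALINKKELE",
    "s4": "MALINKKELD",
}
groups = average_linkage_clusters(toy, min_identity=80.0)
```

On this toy input the two 90%-identical pairs merge, while the between-group mean identity of 55% keeps the groups apart.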

**Fig 2. Toxin subtypes across the *C. difficile* phylogeny and occurrence of subtypes in a clinical CDI cohort.**
(A) TcdA (inner ring) and TcdB (outer ring) subtypes mapped onto a tree of 1,934 *C. difficile* genomes. The tree is a maximum likelihood phylogeny of NCBI-derived *C. difficile* genomes based on 14,194 genome-wide SNPs (see Methods). Lineages corresponding to previously identified *C. difficile* PaLoc clades (1–5) are labeled numerically. Selected clinically relevant strains are shown on the tree, with hypervirulent/epidemic outbreak strains indicated by stars. Asterisks indicate lineages without toxin genes. (B) Frequency of toxin subtypes detected in 1,934 representative, complete *C. difficile* genomes from NCBI/GenBank. A total of 1,640 (84.8%) *C. difficile* strains contained TcdA and/or TcdB, while 294 (15.2%) were toxin deficient. (C) Frequency of toxin subtypes detected in a CDI clinical cohort from Brigham and Women's Hospital (BWH). The total dataset contained 351 *C. difficile* genomes derived from infected patients. Of these, 289 (82.3%) contained toxin genes, and 62 (17.7%) were toxin deficient.
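The percentages in panels (B) and (C) can be checked from the reported counts. The short sketch below only redoes that arithmetic; the helper name `share` and the dictionary layout are illustrative choices.

```python
def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as reported in the caption.
ncbi = {"toxin_positive": 1640, "toxin_deficient": 294}  # of 1,934 genomes
bwh = {"toxin_positive": 289, "toxin_deficient": 62}     # of 351 genomes

ncbi_total = sum(ncbi.values())
bwh_total = sum(bwh.values())

ncbi_pct = {k: share(v, ncbi_total) for k, v in ncbi.items()}
bwh_pct = {k: share(v, bwh_total) for k, v in bwh.items()}
```

Both pairs of counts sum to their stated totals (1,934 and 351), and the rounded shares match the caption.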

**Fig 3. Evolutionary diversification of TcdB by intragenic recombination and domain shuffling.**
(A) Visualization of amino acid variation patterns in TcdB using a newly developed haplotype coloring algorithm (HaploColor). The visualization shows patterns of amino acid variation across the TcdB alignment. In this algorithm, the first sequence (B1.1) is assigned a distinct color, and all other sequences are given that same color wherever they match this first sequence. The process is then repeated using a second sequence (B7.1) as the new reference, and so on. This reveals multiple colored segments indicative of common ancestry (identity by descent). Mosaic patterns are indicative of intragenic recombination. (B) Phylogenetic trees of TcdB based on individual domains. Each domain tree can be subdivided into two types (labeled 1 and 2), which allows each subtype to be described based on its domain composition (C). This reveals that TcdB subtypes are composed of domains with variable evolutionary histories, indicative of domain shuffling and intragenic recombination. (D) Evolutionary model depicting relationships between subtypes and putative recombination events. Here, TcdB split early into two main groups (i and ii). Subtype B2 likely originated by a recombination event fusing an ancestral type i and type ii toxin. B9 likely originated from recombination between B1 and B2, B3 from recombination between B1 and a type ii toxin, and B8 from recombination between B5 and a type ii toxin.
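The coloring procedure described in panel (A) can be sketched as follows. This is a minimal reading of the caption, not the HaploColor implementation. It assumes that positions colored by an earlier reference keep that color, and that gap characters are never colored. The function name and the toy sequences are hypothetical; only the labels B1.1 and B7.1 come from the caption.

```python
def haplotype_colors(alignment: dict, references: list) -> dict:
    """Give each aligned position the index of the first reference it matches.

    alignment  -- name -> aligned sequence (all the same length)
    references -- sequence names, in the order they are used as references
    Returns name -> list of color indices, with None where nothing matched.
    """
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("sequences must be aligned to the same length")

    colors = {name: [None] * length for name in alignment}
    for color, ref_name in enumerate(references):
        ref = alignment[ref_name]
        for name, seq in alignment.items():
            row = colors[name]
            for pos in range(length):
                # Assumption: earlier references take precedence, so a
                # position that is already colored is left alone.
                if row[pos] is None and seq[pos] == ref[pos] and seq[pos] != "-":
                    row[pos] = color
    return colors


# Toy alignment (invented residues): "mosaic" looks like B1.1 at most
# positions but carries B7.1-like residues near its end.
toy = {
    "B1.1": "MSLVNRKQ",
    "B7.1": "MALINKKE",
    "mosaic": "MSLVNKKE",
}
painted = haplotype_colors(toy, references=["B1.1", "B7.1"])
```

In the toy output the `mosaic` row carries color 0 (B1.1-like) at most positions and color 1 (B7.1-like) at two positions near the end, the kind of mixed pattern the caption associates with intragenic recombination.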

**Fig 4. Conservation and functional variation across TcdB subtypes.**
(A) Frequency of amino acid variants across all positions of TcdB. The height of each bar indicates the number of unique TcdB sequences that contain a substitution relative to the classical TcdB1 (B1.1) sequence from strains 630 and VPI 10463. Below this is a plot of amino acid variation for key functional regions, including the binding sites for the frizzled receptor (FZD) and the antibodies (E3, PA41, and bezlotoxumab). The alignment is colored gray for residues that match the common amino acid found in B1.1, and variants are colored blue (darkest blue = most common variant). E3 and PA41 binding sites are highly conserved, whereas FZD and bezlotoxumab binding sites are highly variable. FZD and bezlotoxumab variants also co-occur with each other. (B) Evolutionary conservation mapped to the protein structure of full-length TcdB based on PDB 6OQ5 [65]. Eight highly conserved surface patches are indicated. Center residues within each surface patch are indicated in bold font.
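The bar heights in panel (A) amount to a per-column count of unique sequences that differ from the reference. The sketch below shows one way to compute such counts. It assumes that gap characters are not counted as substitutions, and it uses invented toy sequences and hypothetical function names rather than the authors' code.

```python
from collections import Counter


def substitution_counts(reference: str, sequences: list) -> list:
    """Per alignment column, count unique sequences that differ from the reference."""
    counts = [0] * len(reference)
    for seq in set(sequences):  # each distinct sequence is counted once
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference")
        for pos, (ref_res, res) in enumerate(zip(reference, seq)):
            # Assumption: a gap is not treated as a substitution.
            if res != ref_res and res != "-":
                counts[pos] += 1
    return counts


def variants_at(reference: str, sequences: list, pos: int) -> Counter:
    """Tally non-reference residues at one column across unique sequences."""
    return Counter(
        seq[pos] for seq in set(sequences) if seq[pos] not in (reference[pos], "-")
    )


# Toy data (invented): the last entry duplicates the second one on purpose.
ref = "MSLVNRKQ"
seqs = ["MSLVNRKQ", "MSLVNKKQ", "MSLVNKKE", "MALVNKKE", "MSLVNKKQ"]

heights = substitution_counts(ref, seqs)
most_common_variant = variants_at(ref, seqs, 5).most_common(1)
```

The per-column tally from `variants_at` is the kind of information that would drive the blue shading described in the caption, where the most common variant is darkest.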

Cited by

Bacterial toxins induce non-canonical migracytosis to aggravate acute inflammation.
Li D, Yang Q, Luo J, Xu Y, Li J, Tao L. Cell Discov. 2024 Nov 5;10(1):112. doi: 10.1038/s41421-024-00729-1. PMID: 39500876.
A multivalent mRNA-LNP vaccine protects against Clostridioides difficile infection.
Alameh MG, Semon A, Bayard NU, Pan YG, Dwivedi G, Knox J, Glover RC, Rangel PC, Tanes C, Bittinger K, She Q, Hu H, Bonam SR, Maslanka JR, Planet PJ, Moustafa AM, Davis B, Chevrier A, Beattie M, Ni H, Blizard G, Furth EE, Mach RH, Lavertu M, Sellmyer MA, Tam Y, Abt MC, Weissman D, Zackular JP. Science. 2024 Oct 4;386(6717):69-75. doi: 10.1126/science.adn4955. Epub 2024 Oct 3. PMID: 39361752.
Genomic and phenotypic studies among Clostridioides difficile isolates show a high prevalence of clade 2 and great diversity in clinical isolates from Mexican adults and children with healthcare-associated diarrhea.
Meléndez-Sánchez D, Hernández L, Ares M, Méndez Tenorio A, Flores-Luna L, Torres J, Camorlinga-Ponce M. Microbiol Spectr. 2024 Jul 2;12(7):e0394723. doi: 10.1128/spectrum.03947-23. Epub 2024 Jun 12. PMID: 38864670.
Commensal-pathogen dynamics structure disease outcomes during Clostridioides difficile colonization.
Fishbein SRS, DeVeaux AL, Khanna S, Ferreiro AL, Liao J, Agee W, Ning J, Mahmud B, Wallace MJ, Hink T, Reske KA, Cass C, Guruge J, Leekha S, Rengarajan S, Dubberke ER, Dantas G. Cell Host Microbe. 2025 Jan 8;33(1):30-41.e6. doi: 10.1016/j.chom.2024.12.002. Epub 2024 Dec 27. PMID: 39731916.
Against Clostridioides difficile Infection: An Update on Vaccine Development.
Wang J, Ma Q, Tian S. Toxins (Basel). 2025 May 1;17(5):222. doi: 10.3390/toxins17050222. PMID: 40423305. Review.


References

1. Knight DR, Elliott B, Chang BJ, Perkins TT, Riley TV. Diversity and evolution in the genome of Clostridium difficile. Clin Microbiol Rev. 2015;28: 721–741. doi: 10.1128/CMR.00127-14
2. Guh AY, Mu Y, Winston LG, Johnston H, Olson D, Farley MM, et al. Trends in U.S. burden of Clostridioides difficile infection and outcomes. N Engl J Med. 2020;382: 1320–1330. doi: 10.1056/NEJMoa1910215
3. Heinlen L, Ballard JD. Clostridium difficile infection. Am J Med Sci. 2010;340: 247–252. doi: 10.1097/MAJ.0b013e3181e939d8
4. Rupnik M, Wilcox MH, Gerding DN. Clostridium difficile infection: New developments in epidemiology and pathogenesis. Nat Rev Microbiol. 2009;7: 526–536. doi: 10.1038/nrmicro2164
5. Martin JSH, Monaghan TM, Wilcox MH. Clostridium difficile infection: Epidemiology, diagnosis and understanding transmission. Nat Rev Gastroenterol Hepatol. 2016;13: 206–216. doi: 10.1038/nrgastro.2016.25
