Microbiol Spectr. 2024 Sep 3;12(9):e0053724.
doi: 10.1128/spectrum.00537-24. Epub 2024 Jul 25.

Multilocus sequence typing database for Streptococcus agalactiae contains a spurious allele of the transketolase gene

Swaine L Chen et al. Microbiol Spectr.

Abstract

The tkt (transketolase) gene is one of the seven gene fragments used in the multilocus sequence typing (MLST) system for Streptococcus agalactiae. We discovered that the tkt_134 allele is derived from a homologous gene (which we designate tktX) that is not present in all S. agalactiae; all known strains that contain a match to the tkt_134 allele also contain a gene sequence that is much closer in sequence identity to the other non-tkt_134 alleles (i.e., the canonical tkt gene) in the database. Based on these data, the tkt_134 allele has been removed from the MLST database as of September 2021, and all sequence types containing tkt_134 have also been removed.

IMPORTANCE: Multilocus sequence typing (MLST) databases are a common good and remain important for research, medical, and epidemiological purposes. This remains true even in the context of widespread whole-genome sequencing. We discovered a contaminating allele of the tkt gene in the S. agalactiae MLST database that led to unstable, ambiguous, or erroneous MLST assignment. The allele has since been removed from the public database based on the results presented in this manuscript.

Keywords: GBS; MLST; S. agalactiae; molecular epidemiology.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1
Unrooted approximately maximum likelihood phylogenetic tree of the 151 tkt alleles in the PubMLST database as of August 2021. The alignment is over 480 nt with no gaps. The scale bar is indicated at the bottom; furthermore, a branch representing a single SNP difference is indicated by the black arrow. The tip corresponding to tkt_134 is labeled, and the number of SNPs in the branch leading to tkt_134 is indicated.
Fig 2
Genome comparisons of the canonical tkt and the tktX loci. A reference genetic map, which is used in common in all parts of the figure, is shown using white boxes to represent genes; boxes above the line indicate the coding strand proceeds from left to right on this diagram, while boxes below indicate the opposite orientation. Common gene names/annotations are indicated above each white box. (A) Corresponding systematic gene identifiers (numbers only) are indicated for three reference genomes (prefixes are indicated in parentheses, i.e., the systematic name for the first A909 gene indicated is SAK_RS01730). Gene coordinates from the A909 genome are shown just below the genetic map for reference. Short gray bars indicate the regions that are homologous to the tkt gene region used for MLST. (B) CCUG 28551 genome analysis. Thick black bars indicate blastn matches to the corresponding A909 sequence [using the common genetic map and coordinates from (A)] for the published NCBI assembly and a de novo assembly generated from the raw Illumina reads (“Reassembled”). Coverage maps generated by aligning the raw Illumina reads to the A909 genome are shown at the bottom, again using the same genomic coordinates as in (A). All indications of sequence homology are >99% by blastn for individual genes in (A) as well as larger chromosomal segments in (B).
Fig 3
PCR on genomic DNA using existing primers recommended for amplifying tkt yields a single band regardless of the presence of tktX. Lane 5 (counting from the left) is a 100-bp DNA size ladder, and other lanes are PCRs from the indicated strains. The four strains to the left of the ladder are predicted to carry tktX, while the two on the right are not.
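
The Fig 1 caption describes an ungapped alignment in which one allele sits on a long branch, many SNPs away from every other allele. The sketch below illustrates that general idea with a plain SNP (Hamming) distance rather than the phylogenetic tree used in the paper. The sequences are invented 12-nt toy strings, and the function names and the threshold of 3 are hypothetical choices for illustration, not values taken from the study.

```python
from typing import Dict, List, Tuple


def snp_distance(a: str, b: str) -> int:
    """Count mismatched positions between two ungapped, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (ungapped alignment)")
    return sum(1 for x, y in zip(a, b) if x != y)


def nearest_neighbors(alleles: Dict[str, str]) -> Dict[str, Tuple[str, int]]:
    """For each allele, find the closest other allele and its SNP distance."""
    result = {}
    for name, seq in alleles.items():
        candidates = (
            (other, snp_distance(seq, other_seq))
            for other, other_seq in alleles.items()
            if other != name
        )
        # min() keeps the first candidate on ties, following dict order
        result[name] = min(candidates, key=lambda pair: pair[1])
    return result


def flag_outliers(alleles: Dict[str, str], threshold: int) -> List[str]:
    """Return alleles whose nearest neighbor is more than `threshold` SNPs away."""
    nearest = nearest_neighbors(alleles)
    return [name for name, (_, dist) in nearest.items() if dist > threshold]


# Toy data, invented for illustration only (not real tkt alleles)
toy_alleles = {
    "allele_1": "ACGTACGTACGT",
    "allele_2": "ACGTACGTACGA",  # 1 SNP from allele_1
    "allele_3": "ACGTACCTACGT",  # 1 SNP from allele_1
    "allele_X": "TTGTAGGTCCGA",  # divergent from all others
}

outliers = flag_outliers(toy_alleles, threshold=3)  # threshold is an assumption
print(outliers)
```

An allele flagged this way is only a candidate for closer inspection; a large distance alone does not show that the sequence comes from a different gene.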

