J Clin Microbiol. 2022 Jun 15;60(6):e0031522.
doi: 10.1128/jcm.00315-22. Epub 2022 May 9.

Annotated Whole-Genome Multilocus Sequence Typing Schema for Scalable High-Resolution Typing of Streptococcus pyogenes

A Friães et al. J Clin Microbiol.

Abstract

Streptococcus pyogenes is a major human pathogen with high genetic diversity, largely created by recombination and horizontal gene transfer, making it difficult to use single nucleotide polymorphism (SNP)-based genome-wide analyses for surveillance. Using a gene-by-gene approach on 208 complete genomes of S. pyogenes, a novel whole-genome multilocus sequence typing (wgMLST) schema was developed, comprising 3,044 target loci. The schema was used for core-genome MLST (cgMLST) analyses of previously published data sets and 265 newly sequenced draft genomes with other molecular and phenotypic typing data. Clustering based on cgMLST data supported the genetic heterogeneity of many emm types and correlated poorly with pulsed-field gel electrophoresis macrorestriction profiling, superantigen gene profiling, and MLST sequence type, highlighting the limitations of older typing methods. While 763 loci were present in all isolates of a data set representative of S. pyogenes genetic diversity, the proposed schema allows scalable cgMLST analysis, which can include more loci for an increased resolution when typing closely related isolates. The cgMLST and PopPUNK clusters were broadly consistent in this diverse population. The cgMLST analyses presented results comparable to those of SNP-based methods in the identification of two recently emerged sublineages of emm1 and emm89 and the clarification of the genetic relatedness among isolates recovered in outbreak contexts. The schema was thoroughly annotated and made publicly available on the chewie-NS online platform (https://chewbbaca.online/species/1/schemas/1), providing a framework for high-resolution typing and analyzing the genetic variability of loci of particular biological interest.
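
The scalable core-genome selection described in the abstract can be illustrated with a minimal sketch. It assumes that allele calls are held as a dictionary mapping isolate names to per-locus allele identifiers, with `None` marking a locus that was not called. The function names, the `min_presence` parameter, and the toy data are hypothetical; they are not taken from chewBBACA or from the published schema.

```python
from itertools import combinations


def core_loci(profiles, min_presence=1.0):
    """Return loci called in at least `min_presence` of the isolates.

    min_presence=1.0 mimics a cgMLST-100-style selection (locus called
    in every isolate); lower values admit more loci.
    """
    isolates = list(profiles)
    if not isolates:
        return []
    loci = sorted({locus for calls in profiles.values() for locus in calls})
    selected = []
    for locus in loci:
        present = sum(
            1 for iso in isolates if profiles[iso].get(locus) is not None
        )
        if present / len(isolates) >= min_presence:
            selected.append(locus)
    return selected


def allelic_distance(a, b, loci):
    """Count loci where both isolates have a call and the alleles differ."""
    return sum(
        1
        for locus in loci
        if a.get(locus) is not None
        and b.get(locus) is not None
        and a[locus] != b[locus]
    )


def pairwise_distances(profiles, loci):
    """Allelic distance for every unordered pair of isolates."""
    return {
        (x, y): allelic_distance(profiles[x], profiles[y], loci)
        for x, y in combinations(sorted(profiles), 2)
    }


# Toy allele calls (hypothetical, for illustration only).
toy_profiles = {
    "iso1": {"locusA": 1, "locusB": 1, "locusC": 1, "locusD": 2},
    "iso2": {"locusA": 1, "locusB": 2, "locusC": 1, "locusD": None},
    "iso3": {"locusA": 3, "locusB": 2, "locusC": 1, "locusD": 2},
}

strict = core_loci(toy_profiles, 1.0)    # loci called in all isolates
relaxed = core_loci(toy_profiles, 0.6)   # also admits locusD (2 of 3 isolates)
distances = pairwise_distances(toy_profiles, strict)
```

Under these assumptions, restricting `toy_profiles` to a set of closely related isolates before calling `core_loci` would typically enlarge the shared locus set, which is in line with the different numbers of compared loci reported in the figure captions (763 to 1,404).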

Keywords: Streptococcus pyogenes; bioinformatics; genomics; group A Streptococcus; molecular epidemiology; molecular subtyping; outbreak; population genetics; surveillance studies; typing.

Conflict of interest statement

The authors declare a conflict of interest. J.M.-C. received research grants administered through his university and received honoraria for serving on the speakers bureaus of Pfizer and Merck Sharp and Dohme. M.R. received honoraria for serving on the speakers bureau of Pfizer and Merck Sharp and Dohme and for serving in expert panels of GlaxoSmithKline and Merck Sharp and Dohme. All other authors declare no conflict of interest.

Figures

FIG 1
Minimum-spanning tree generated with the goeBURST algorithm for the cgMLST-100 profiles of 265 S. pyogenes isolates recovered in Portugal (see Data Set 1 in reference 31). The size of each node is proportional to the number of isolates with that particular cgMLST-100 profile on a logarithmic scale. Nodes are colored according to emm type. Link distances of ≥1,000 allelic differences are labeled (from a total of 1,230 compared loci).
FIG 2
Minimum-spanning tree generated with the goeBURST algorithm for the cgMLST profiles of 2,006 genetically diverse S. pyogenes isolates recovered worldwide (19) (see Data Set 2 in reference 31). The size of each node is proportional to the number of isolates with that particular cgMLST profile on a logarithmic scale. Nodes are colored according to emm type. Groups of clustered emm types represented by >30 isolates are highlighted inside rectangles and labeled with the respective emm types and PopPUNK (PP) phylogroup numbers (for simplicity, isolated nodes of emm types 4, 22, 44, 65, 75, 77, 81, and 92 are not highlighted). A total of 763 core loci were compared.
FIG 3
Box-and-whisker plots for the pairwise distances of the assemblies from Data Set 2 (19, 31) included in each emm type with ≥10 isolates (A) or in each PopPUNK phylogroup with ≥10 isolates (B). The distances were calculated based on the allele call results for the 763 cgMLST-100 loci of the 2,006 assemblies (interactive versions of these plots are available as supplemental material in reference 31).
FIG 4
Minimum-spanning tree generated with the goeBURST algorithm for the cgMLST-100 profiles of 119 outbreak S. pyogenes isolates recovered in the United Kingdom (18) (see Data Set 3 in reference 31). The size of each node is proportional to the number of isolates with that particular cgMLST-100 profile on a logarithmic scale. The nodes are colored according to the emm type, and the outer ring is colored according to the outbreak number. Link distances are labeled as the number of allelic differences between nodes (from a total of 1,263 compared loci).
FIG 5
Graph representation of the relationships between the cgMLST-100 profiles of 135 noninvasive emm1 isolates recovered in the United Kingdom (11) and reference strain MGAS5005 (see Data Set 4 in reference 31), depicting all links with ≤19 allelic differences (from a total of 1,404 compared loci). The size of each node is proportional to the number of isolates with that particular cgMLST-100 profile on a logarithmic scale. Nodes are colored according to the M1 lineage, with MGAS5005 (reference genome for the M1global lineage) in green. Links that would not be present in the standard MST are shown in green. Links shown in black represent the MST links and may represent distances with >19 allelic differences.
FIG 6
Graph representation of the relationships between the cgMLST-100 profiles of 201 emm89 isolates (see Data Set 5 in reference 31) depicting all links with ≤55 allelic differences (from a total of 1,279 compared loci). The size of each node is proportional to the number of isolates with that particular cgMLST-100 profile on a logarithmic scale. Nodes are colored according to the variant of the nga promoter (Pnga). Links that would not be present in the standard MST are shown in green. Links shown in black represent the MST links and may represent distances with >55 allelic differences (labeled links).
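
The graph representations in FIG 5 and FIG 6 combine minimum-spanning-tree (MST) links with every additional link at or below an allelic-distance threshold. The following is a minimal sketch of that idea, assuming a symmetric distance dictionary keyed by `frozenset` pairs. It uses plain Prim's algorithm with ties broken by node name, not goeBURST, which applies its own tie-break rules. All names and the toy distances are hypothetical.

```python
def minimum_spanning_tree(nodes, dist):
    """Plain Prim's algorithm (NOT goeBURST); ties broken by node name."""
    nodes = sorted(nodes)
    if not nodes:
        return set()
    in_tree = {nodes[0]}
    edges = set()
    while len(in_tree) < len(nodes):
        # Cheapest link from the current tree to any node outside it.
        _, u, v = min(
            (dist[frozenset((u, v))], u, v)
            for u in in_tree
            for v in nodes
            if v not in in_tree
        )
        edges.add(frozenset((u, v)))
        in_tree.add(v)
    return edges


def threshold_links(nodes, dist, max_diff):
    """All links with at most `max_diff` allelic differences."""
    nodes = sorted(nodes)
    return {
        frozenset((u, v))
        for i, u in enumerate(nodes)
        for v in nodes[i + 1:]
        if dist[frozenset((u, v))] <= max_diff
    }


# Toy allelic distances between four profiles (hypothetical).
toy_nodes = ["p1", "p2", "p3", "p4"]
toy_dist = {
    frozenset(("p1", "p2")): 2,
    frozenset(("p1", "p3")): 3,
    frozenset(("p2", "p3")): 4,
    frozenset(("p1", "p4")): 30,
    frozenset(("p2", "p4")): 25,
    frozenset(("p3", "p4")): 28,
}

mst = minimum_spanning_tree(toy_nodes, toy_dist)
close = threshold_links(toy_nodes, toy_dist, max_diff=5)

# Links at or below the threshold that the MST alone would not show
# (the role played by the green links in the captions above).
extra_links = close - mst
# MST links that exceed the threshold (the labeled black links).
long_mst_links = {e for e in mst if toy_dist[e] > 5}
```

In this toy case the MST keeps one long link (25 differences) so that `p4` stays connected, while the threshold view adds the `p2`-`p3` link that the tree omits.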

References

    1. Carapetis JR, Steer AC, Mulholland EK, Weber M. 2005. The global burden of group A streptococcal diseases. Lancet Infect Dis 5:685–694. doi: 10.1016/S1473-3099(05)70267-X.
    2. Vekemans J, Gouvea-Reis F, Kim JH, Excler J-L, Smeesters PR, O’Brien KL, Van Beneden CA, Steer AC, Carapetis JR, Kaslow DC. 2019. The path to group A Streptococcus vaccines: World Health Organization research and development technology roadmap and preferred product characteristics. Clin Infect Dis 69:877–883. doi: 10.1093/cid/ciy1143.
    3. Beall B, Facklam R, Thompson T. 1996. Sequencing emm-specific PCR products for routine and accurate typing of group A streptococci. J Clin Microbiol 34:953–958. doi: 10.1128/jcm.34.4.953-958.1996.
    4. Carriço JA, Silva-Costa C, Melo-Cristino J, Pinto FR, de Lencastre H, Almeida JS, Ramirez M. 2006. Illustration of a common framework for relating multiple typing methods by application to macrolide-resistant Streptococcus pyogenes. J Clin Microbiol 44:2524–2532. doi: 10.1128/JCM.02536-05.
    5. Friães A, Pinto FR, Silva-Costa C, Ramirez M, Melo-Cristino J. 2013. Superantigen gene complement of Streptococcus pyogenes—relationship with other typing methods and short-term stability. Eur J Clin Microbiol Infect Dis 32:115–125. doi: 10.1007/s10096-012-1726-3.
