Emerg Infect Dis. 2025 May;31(13):117-128.
doi: 10.3201/eid3113.241950.

Lessons from 5 Years of Routine Whole-Genome Sequencing for Epidemiologic Surveillance of Shiga Toxin-Producing Escherichia coli, France, 2018-2022

Gabrielle Jones et al. Emerg Infect Dis. 2025 May.

Abstract

Whole-genome sequencing (WGS) is routine for surveillance of Shiga toxin-producing Escherichia coli human isolates in France. Protocols use EnteroBase hierarchical clustering at <5 allelic differences (HC5) as screening for cluster detection. We assessed current implementation after 5 years for 1,002 sequenced isolates. From genomic distances of serotypes O26:H11, O157:H7, O80:H2, and O103:H2, we determined statistical thresholds for cluster determination and compared those with HC5 clusters. Thresholds varied by serotype, 5-16 allelic distances and 15-20 single-nucleotide polymorphisms, showing limits of a single-threshold approach. We confirmed validity of HC5 screening for 3 serotypes because statistical thresholds had limited effect on isolate clustering (high sensitivity and specificity). For O80:H2, results suggest that HC5 is less reliable, and other approaches should be explored. Public health officials should regularly assess WGS used for Shiga toxin-producing E. coli surveillance to account for serotype and genomic evolution and to interpret WGS-linked isolates in light of epidemiologic data.

Keywords: France; STEC; Shiga toxin–producing Escherichia coli; bacteria; cluster detection; epidemiologic surveillance; hierarchical clustering; whole-genome sequencing.
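
The screening idea described in the abstract can be illustrated with a small sketch. It groups isolates by single linkage at a fixed allelic-distance cutoff, then compares that grouping with one obtained at a different, serotype-specific cutoff. Everything here is an assumption made for illustration: the distance matrix is invented, the cutoff of 8 is hypothetical, pairs at or below the cutoff are treated as linked, and sensitivity and specificity are computed over isolate pairs, which may differ from the definitions used in the study. It is not EnteroBase's implementation.

```python
from itertools import combinations


def single_linkage_clusters(dist, threshold):
    """Label isolates so that any pair with distance <= threshold shares a label.

    Labels are the smallest isolate index in each cluster (union-find roots).
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if dist[i][j] <= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                # Keep the smaller index as the root so labels are stable.
                parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(n)]


def pairwise_agreement(reference, screening):
    """Sensitivity and specificity of screening co-clustering, over isolate pairs."""
    tp = fp = tn = fn = 0
    for i, j in combinations(range(len(reference)), 2):
        ref_same = reference[i] == reference[j]
        scr_same = screening[i] == screening[j]
        if ref_same and scr_same:
            tp += 1
        elif ref_same:
            fn += 1
        elif scr_same:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Hypothetical pairwise allelic distances for five isolates (not study data).
D = [
    [0, 2, 4, 9, 30],
    [2, 0, 3, 8, 31],
    [4, 3, 0, 7, 29],
    [9, 8, 7, 0, 28],
    [30, 31, 29, 28, 0],
]

hc5_labels = single_linkage_clusters(D, threshold=5)       # fixed screening cutoff
serotype_labels = single_linkage_clusters(D, threshold=8)  # hypothetical serotype cutoff
sens, spec = pairwise_agreement(serotype_labels, hc5_labels)
print(hc5_labels, serotype_labels, sens, spec)
```

In this toy example the wider cutoff pulls a fourth isolate into the cluster, so the fixed cutoff misses some linked pairs (lower sensitivity) without creating false links (specificity unchanged).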

Figures

Figure 1
Characteristics from 5 years of routine whole-genome sequencing for epidemiologic surveillance of Shiga toxin–producing Escherichia coli, France, 2018–2022. A) Distribution of pairwise allelic distances; B) SNP distances, censored at 50. Shiga toxin–producing Escherichia coli serotypes are shown for each panel. SNP, single-nucleotide polymorphism.
Figure 2
Mixture of distributions model applied to allelic distance data from 5 years of routine whole-genome sequencing for epidemiologic surveillance of Shiga toxin–producing Escherichia coli, France, 2018–2022. A) Number of components fit to the data distribution; B) threshold represented as the probability of belonging to the first distribution. Shiga toxin–producing Escherichia coli serotypes are shown for each panel. Black line indicates global estimated density; black circles, probability of belonging to first distribution for each observed allelic or single-nucleotide polymorphism distance; red line, largest allelic or single-nucleotide polymorphism distance that has a 50% probability of belonging to the first distribution. Comp, component.
Figure 3
Mixture of distributions model applied to SNP distance data from 5 years of routine whole-genome sequencing for epidemiologic surveillance of Shiga toxin–producing Escherichia coli, France, 2018–2022. A) Number of components fit to the data distribution; B) threshold represented as a probability of belonging to the first or second distribution. Shiga toxin–producing Escherichia coli serotypes are shown for each panel. Comp, component; SNP, single-nucleotide polymorphism.
Figure 4
Regression from hierarchical clustering at a threshold of 5 allelic differences from 5 years of routine whole-genome sequencing for epidemiologic surveillance of Shiga toxin–producing Escherichia coli, France, 2018–2022. A) Allelic distance; B) SNP distance. Distances calculated as a function of time in days by multivariable fractional polynomial linear regression. Black circles indicate estimated allelic or SNP distance for each observed temporal distance in days; blue, red, green, and black vertical lines, 95% CIs of the estimated genomic distances for each observed temporal distance in days. SNP, single-nucleotide polymorphism.
Figure 5
Single-nucleotide polymorphism–based maximum likelihood phylogenetic tree of 226 O80:H2 isolates from 5 years of routine whole-genome sequencing for epidemiologic surveillance of Shiga toxin–producing Escherichia coli, France, 2018–2022. Tree was built based on the sequence alignment of 3,949 single-nucleotide variant sites of the recombination-free core genome of E. coli strain MOD1-EC6881 (GenBank accession no. GCF_002520045.1). Tree was midpoint-rooted and visualized with iTOL (https://itol.embl.de). Bootstrap support values >90% are indicated with red dots on the branches. Branch lengths and corresponding scale bar indicate numbers of single-nucleotide polymorphisms per base of the final alignment. HC5, hierarchical clustering at a threshold of 5 allelic differences.
