Validation strategy of a bioinformatics whole genome sequencing workflow for Shiga toxin-producing Escherichia coli using a reference collection extensively characterized with conventional methods
- PMID: 33656437
- PMCID: PMC8190621
- DOI: 10.1099/mgen.0.000531
Abstract
Whole genome sequencing (WGS) enables complete characterization of bacterial pathogenic isolates at single nucleotide resolution, making it the ultimate tool for routine surveillance and outbreak investigation. However, the lack of standardization and the variation in bioinformatics workflows and parameters complicate interoperability among (inter)national laboratories. We present a validation strategy applied to a bioinformatics workflow for Illumina data that performs complete characterization of Shiga toxin-producing Escherichia coli (STEC) isolates, including antimicrobial resistance prediction, virulence gene detection, serotype prediction, plasmid replicon detection and sequence typing. The workflow supports three commonly used bioinformatics approaches for the detection of genes and alleles: alignment with BLAST+, kmer-based read mapping with KMA, and direct read mapping with SRST2. A collection of 131 STEC isolates from food and human sources, extensively characterized with conventional molecular methods, was used as a validation dataset. Using a validation strategy specifically adapted to WGS, we demonstrated high performance, with repeatability, reproducibility, accuracy, precision, sensitivity and specificity above 95% for the majority of assays. The WGS workflow is publicly available as a 'push-button' pipeline at https://galaxy.sciensano.be. Our validation strategy and accompanying reference dataset, consisting of both conventional and WGS data, can be used for characterizing the performance of various bioinformatics workflows and assays, facilitating interoperability between laboratories with different WGS and bioinformatics set-ups.
Keywords: Escherichia coli; STEC; foodborne pathogens; public health; validation; whole genome sequencing.
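The abstract reports accuracy, precision, sensitivity and specificity for each assay but does not give the formulas. As a minimal sketch, assuming the standard confusion-matrix definitions (the article's exact definitions may differ), the metrics can be derived by comparing WGS-based calls against the conventional reference results. The names `ConfusionCounts`, `count_outcomes` and `performance_metrics` are hypothetical and do not come from the workflow itself.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for one assay (hypothetical helper type)."""
    tp: int  # feature present by conventional method and detected by WGS
    fp: int  # feature absent by conventional method but detected by WGS
    tn: int  # feature absent by conventional method and not detected by WGS
    fn: int  # feature present by conventional method but missed by WGS


def count_outcomes(reference: Sequence[bool], predicted: Sequence[bool]) -> ConfusionCounts:
    """Tally outcomes from paired presence/absence calls, one pair per isolate."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1
        elif ref and not pred:
            fn += 1
        elif not ref and pred:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def performance_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard definitions, assumed here rather than taken from the article."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the study.
    example = ConfusionCounts(tp=48, fp=1, tn=50, fn=1)
    for name, value in performance_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, an assay would meet the 95% threshold mentioned in the abstract when each returned value is above 0.95. Repeatability and reproducibility are not covered here, since they depend on comparing replicate runs rather than on a reference result.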
Conflict of interest statement
The authors declare that there are no conflicts of interest.