J Clin Microbiol. 2024 Jun 12;62(6):e0057023.
doi: 10.1128/jcm.00570-23. Epub 2024 Apr 24.

Construction of the ETECFinder database for the characterization of enterotoxigenic Escherichia coli (ETEC) and revision of the VirulenceFinder web tool at the CGE website

Flemming Scheutz et al. J Clin Microbiol. 2024.

Abstract

The identification of pathogens is essential for effective surveillance and outbreak detection, which lately has been facilitated by the decreasing cost of whole-genome sequencing (WGS). However, extracting relevant virulence genes from WGS data remains a challenge. In this study, we developed a web-based tool to predict virulence-associated genes in enterotoxigenic Escherichia coli (ETEC), which is a major concern for human and animal health. The database includes genes encoding the heat-labile toxin (LT) (eltA and eltB), heat-stable toxin (ST) (est), colonization factors CS1 through 30, F4, F5, F6, F17, F18, and F41, as well as toxigenic invasion and adherence loci (tia, tibAC, etpBAC, eatA, yghJ, and tleA). To construct the database, we revised the existing ETEC nomenclature and used the VirulenceFinder web tool at the CGE website [VirulenceFinder 2.0 (dtu.dk)]. The database was tested on 1,083 preassembled ETEC genomes, two BioProjects (PRJNA421191 with 305 and PRJNA416134 with 134 sequences), and the ETEC reference genome H10407. In total, 455 new virulence gene alleles were added, 50 alleles were replaced or renamed, and two were removed. Overall, our tool has the potential to greatly facilitate ETEC identification and improve the accuracy of WGS analysis. It can also help identify potential new virulence genes in ETEC. The revised nomenclature and expanded gene repertoire provide a better understanding of the genetic diversity of ETEC. Additionally, the user-friendly interface makes it accessible to users with limited bioinformatics experience.

Importance: Detecting colonization factors in enterotoxigenic Escherichia coli (ETEC) is challenging due to their large number, heterogeneity, and lack of standardized tests. Therefore, it is important to include these ETEC-related genes in a more comprehensive VirulenceFinder database in order to obtain a more complete coverage of the virulence gene repertoire of pathogenic types of E. coli. ETEC vaccines are of great importance due to the severity of the infections, primarily in children. A tool such as this could assist in the surveillance of ETEC in order to determine the prevalence of relevant types in different parts of the world, allowing vaccine developers to target the most prevalent types and thus develop a more effective vaccine.

Keywords: CGE website; ETEC virulence genes; WGS tool; curated database; enterotoxigenic E. coli.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1
Maximum parsimony tree of 31 LTI reference nucleotide sequences, three new LTI types (LTI-31, LTI-32, and LTI-33), and four new LT variants (LTIh-12b, LTIh-15b, LTIh-18b, and LTIh-18c) identified in 890 ETEC genomes analyzed with the revised VirulenceFinder database. The colors and groups I–V correspond to the designation and colors in Joffré et al. (13), where concatenated protein sequences were presented. Four nucleotides “TGAA” were removed from the NCBI-submitted sequences as described in the text. Only branch lengths larger than 1.00 are shown.
Fig 2
Maximum parsimony tree of 31 LTI and 15 LTII reference nucleotide sequences, three new LTI types (LTI-31, LTI-32, and LTI-33), and four new LTI variants (LTIh-12b, LTIh-15b, LTIh-18b, and LTIh-18c) identified in 890 ETEC genomes analyzed with the revised VirulenceFinder database. LTII-a–LTII-e sequences were 48.6%–54.3% identical to LTI sequences. Only branch lengths of 15.00 or larger are shown. For a high resolution of eltI, see Fig. 1.
Fig 3
ETECFinder results for an enterotoxigenic Escherichia coli (strain M2, ID 31919_4_289) isolate in the short output format using the revised VirulenceFinder database with FASTQ files from the same enterotoxigenic Escherichia coli as in Fig. 4. Multiple new virulence gene alleles are identified. Shown are the names of the best-matching allele in the VirulenceFinder, the percentage of nucleotides that are identical to the best-matching allele in the database and the corresponding sequence in the genome (percent identity), the length of the alignment between the best-matching allele in the database and the corresponding sequence in the genome [also called the high-scoring segment pair (HSP)], the length of the best-matching allele in the database, the name and function of the best-matching allele, and an LT type. Color indications: the dark green color indicates a perfect match for a given gene. The percent identity is 100%, and the sequence in the genome covers the entire length of the virulence gene in the database. The light green color indicates a warning due to a non-perfect match, percent identity < 100%, HSP length = virulence gene length. The gray color indicates a warning due to a non-perfect match, HSP length is shorter than the virulence gene length, percent identity = 100%. The red color indicates that no virulence gene with a match over the given threshold was found. See VirulenceFinder 2.0 output (dtu.dk).
Fig 4
ETECFinder results for an enterotoxigenic Escherichia coli isolate in the short output format using the revised VirulenceFinder database with a preassembled genome of the same enterotoxigenic Escherichia coli genome as in Fig. 3.
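
The color scheme described in the Fig. 3 caption can be read as a small decision rule over percent identity and HSP length. The following Python sketch illustrates that rule only; it is not the VirulenceFinder implementation. The names `Hit` and `color_for` are hypothetical, and the `"unspecified"` label is an assumption for combinations the caption does not describe (for example, an identity below 100% together with an HSP shorter than the gene).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    """Hypothetical container for one best-matching allele (not a CGE type)."""
    percent_identity: float  # % identical nucleotides between allele and genome
    hsp_length: int          # length of the alignment (HSP)
    gene_length: int         # length of the best-matching allele in the database


def color_for(hit: Optional[Hit]) -> str:
    """Map a hit to the color named in the Fig. 3 caption.

    None stands for "no virulence gene with a match over the threshold".
    """
    if hit is None:
        return "red"

    perfect_identity = hit.percent_identity == 100.0
    full_length = hit.hsp_length == hit.gene_length

    if perfect_identity and full_length:
        return "dark green"   # perfect match over the entire gene
    if full_length:
        return "light green"  # identity < 100%, HSP length = gene length
    if perfect_identity and hit.hsp_length < hit.gene_length:
        return "gray"         # identity = 100%, HSP shorter than the gene
    # Assumption: the caption does not cover the remaining combinations.
    return "unspecified"


if __name__ == "__main__":
    # Illustrative values only; these are not results from the paper.
    for example in (Hit(100.0, 300, 300), Hit(99.2, 300, 300),
                    Hit(100.0, 250, 300), None):
        print(example, "->", color_for(example))
```

Keeping the uncovered combinations in a separate `"unspecified"` branch avoids guessing how the web tool colors them.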

References

    1. Qadri F, Svennerholm A-M, Faruque ASG, Sack RB. 2005. Enterotoxigenic Escherichia coli in developing countries: epidemiology, microbiology, clinical features, treatment, and prevention. Clin Microbiol Rev 18:465–483. doi:10.1128/CMR.18.3.465-483.2005
    2. Havelaar AH, Kirk MD, Torgerson PR, Gibb HJ, Hald T, Lake RJ, Praet N, Bellinger DC, de Silva NR, Gargouri N, Speybroeck N, Cawthorne A, Mathers C, Stein C, Angulo FJ, Devleesschauwer B, World Health Organization Foodborne Disease Burden Epidemiology Reference Group. 2015. World health organization global estimates and regional comparisons of the burden of foodborne disease in 2010. PLoS Med 12:e1001923. doi:10.1371/journal.pmed.1001923
    3. Kotloff KL, Nataro JP, Blackwelder WC, Nasrin D, Farag TH, Panchalingam S, Wu Y, Sow SO, Sur D, Breiman RF, et al. 2013. Burden and aetiology of diarrhoeal disease in infants and young children in developing countries (the global enteric multicenter study, GEMS): a prospective, case-control study. Lancet 382:209–222. doi:10.1016/S0140-6736(13)60844-2
    4. Dubreuil JD, Isaacson RE, Schifferli DM. 2016. Animal enterotoxigenic Escherichia coli. EcoSal Plus 7. doi:10.1128/ecosalplus.ESP-0006-2016
    5. Murphy D, Ricci A, Auce Z, Beechinor JG, Bergendahl H, Breathnach R, Bureš J, Duarte Da Silva JP, Hederová J, Hekman P, et al. 2017. EMA and EFSA joint scientific opinion on measures to reduce the need to use antimicrobial agents in animal husbandry in the European Union, and the resulting impacts on food safety (RONAFA). EFSA J 15:e04666. doi:10.2903/j.efsa.2017.4666