J Infect. 2024 Nov;89(5):106265.
doi: 10.1016/j.jinf.2024.106265. Epub 2024 Sep 7.

Machine learning to attribute the source of Campylobacter infections in the United States: A retrospective analysis of national surveillance data

Ben Pascoe et al. J Infect. 2024 Nov.

Abstract

Objectives: Integrating pathogen genomic surveillance with bioinformatics can enhance public health responses by identifying risk and guiding interventions. This study focusses on the two predominant Campylobacter species, which are commonly found in the gut of birds and mammals and often infect humans via contaminated food. Rising incidence and antimicrobial resistance (AMR) are a global concern, and there is an urgent need to quantify the main routes to human infection.

Methods: During routine US national surveillance (2009-2019), 8856 Campylobacter genomes from human infections and 16,703 from possible sources were sequenced. Using machine learning and probabilistic models, we target genetic variation associated with host adaptation to attribute the source of human infections and estimate the importance of different disease reservoirs.
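To illustrate one ingredient of this approach, the sketch below scores presence/absence patterns of genetic markers by their mutual information with host labels. It is a minimal illustration, not the authors' pipeline: the unitig names, the toy isolates, and the `mutual_information` helper are all hypothetical, and the Random Forest step described in the figure captions is not reproduced here.

```python
from collections import Counter
from math import log2


def mutual_information(presence, hosts):
    """Mutual information (in bits) between a presence/absence pattern and host labels."""
    n = len(presence)
    joint = Counter(zip(presence, hosts))   # counts of (presence, host) pairs
    marginal_x = Counter(presence)          # counts of presence values
    marginal_y = Counter(hosts)             # counts of host labels
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = marginal_x[x] / n
        p_y = marginal_y[y] / n
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


# Hypothetical toy data: 8 isolates, 4 from each host (not from the study).
hosts = ["chicken"] * 4 + ["cattle"] * 4
patterns = {
    "unitig_A": [1, 1, 1, 1, 0, 0, 0, 0],  # present only in chicken isolates
    "unitig_B": [1, 0, 1, 0, 1, 0, 1, 0],  # equally common in both hosts
    "unitig_C": [1, 1, 1, 0, 0, 0, 0, 1],  # mostly, but not only, in chicken
}

scores = {name: mutual_information(p, hosts) for name, p in patterns.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.3f} bits")
```

In this toy setting, a pattern that perfectly separates the two hosts scores highest and a pattern shared equally by both scores zero, which is the property that makes such scores usable for selecting host-associated markers.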

Results: Poultry was identified as the primary source of human infections, responsible for an estimated 68% of cases, followed by cattle (28%), and only a small contribution from wild birds (3%) and pork sources (1%). There was also evidence of an increase in multidrug resistance, particularly among isolates attributed to chickens.
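As a rough illustration of how per-case source probabilities can be summarised into population-level percentages like those above, the sketch below averages probability vectors across cases. This is a generic calculation under stated assumptions, not the probabilistic model used in the study: the `attribution_proportions` function and every probability value shown are hypothetical.

```python
def attribution_proportions(case_probabilities):
    """Average per-case source probabilities into overall source proportions."""
    sources = sorted({s for probs in case_probabilities for s in probs})
    totals = {s: 0.0 for s in sources}
    for probs in case_probabilities:
        case_total = sum(probs.values())
        for source, p in probs.items():
            # Normalise so that each case contributes exactly one unit in total.
            totals[source] += p / case_total
    n_cases = len(case_probabilities)
    return {s: totals[s] / n_cases for s in sources}


# Hypothetical per-case probabilities (illustrative values only).
cases = [
    {"poultry": 0.9, "cattle": 0.1},
    {"poultry": 0.6, "cattle": 0.3, "wild bird": 0.1},
    {"poultry": 0.2, "cattle": 0.8},
    {"poultry": 0.7, "cattle": 0.2, "pork": 0.1},
]

proportions = attribution_proportions(cases)
for source, share in sorted(proportions.items(), key=lambda kv: -kv[1]):
    print(f"{source}: {share:.1%}")
```

Averaging probabilities rather than counting each case's most likely source keeps the uncertainty of ambiguous cases in the final estimate.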

Conclusions: National surveillance and source attribution can guide policy, and our study suggests that interventions targeting poultry will yield the greatest reductions in campylobacteriosis and spread of AMR in the US.

Data availability: All sequence reads were uploaded to NCBI's Sequence Read Archive (SRA) under the following BioProjects: PRJNA239251 (CDC / PulseNet surveillance), PRJNA287430 (FSIS surveillance), PRJNA292668 and PRJNA292664 (NARMS), and PRJNA258022 (FDA surveillance). Publicly available genomes, including reference genomes and isolates sampled worldwide from wild birds, are associated with BioProject accessions PRJNA176480, PRJNA177352, PRJNA342755, PRJNA345429, PRJNA312235, PRJNA415188, PRJNA524300, PRJNA528879, PRJNA529798, PRJNA575343, PRJNA524315 and PRJNA689604. Contiguous assemblies of all genome sequences compared are available at Mendeley Data (assembled C. coli genomes doi: 10.17632/gxswjvxyh3.1; assembled C. jejuni genomes doi: 10.17632/6ngsz3dtbd.1). Individual project and accession numbers can be found in Supplementary Tables S1 and S2, which also include PubMLST identifiers for assembled genomes. Additional material is available on Figshare (doi: 10.6084/m9.figshare.20279928). Interactive phylogenies are hosted on Microreact separately for C. jejuni (https://microreact.org/project/pascoe-us-cjejuni) and C. coli (https://microreact.org/project/pascoe-us-ccoli).

Keywords: Campylobacteriosis; Chicken consumption; Gastroenteritis; Machine learning; Source attribution.

Conflict of interest statement

Declaration of Competing Interest: All authors declare no conflict of interest.

Figures

Fig. 1. Isolate genomes used in this study.
(A) Overview of the number of isolates from each source, including chicken (yellow; n = 9395), cattle (green; n = 5023), swine (pink; n = 1175), clinical cases (grey; n = 8856), environmental sources (blue; n = 248), turkey (orange; n = 497) and a global collection from birds (purple; n = 365). Samples were collected from all over the US, with states that contributed higher numbers of samples coloured more deeply on the maps. (B) Phylogeny of 18,306 C. jejuni genomes constructed using mashtree, coloured by source. The two large host generalist clonal complexes (ST-21 CC and ST-45 CC) are labelled along with the chicken-associated clonal complex. (C) Mash phylogeny of 7253 C. coli genomes coloured by source. The large host generalist clonal complex, ST-828 CC, and three ancestral clades (Clades 1, 2 and 3) are labelled along with the highly introgressed ST-1150 clonal complex.
Fig. 2. Overview of C. jejuni attribution study.
(A) Random Forest analyses were performed on training data from chicken, cattle, and wild bird sources to score patterns of unitigs according to their ability to predict isolate source. Dots within each box show how common each pattern is in the two intersecting hosts. Patterns within the small, dotted boxes are common in the host on the horizontal axis, but rare in the host labelled on the vertical axis. (B) Patterns of unitigs with highly discriminatory mutual information (MI) scores were used to select markers for different hosts. (C) Genes from which unitig markers were selected and assigned allele numbers for attribution. (D) Markers were tested on a subset of the data for accuracy (overall accuracy > 90%); and (E) used to predict the source of 8160 C. jejuni infection cases.
Fig. 3. Overview of C. coli attribution study.
(A) Random Forest analyses were performed on training data from chicken, cattle, turkey, and pig sources to score patterns of unitigs according to their ability to predict isolate source. Dots within each box show how common each pattern is in the two intersecting hosts. Patterns within the small, dotted boxes are common in the host on the horizontal axis, but rare in the host labelled on the vertical axis. (B) Patterns of unitigs with highly discriminatory mutual information (MI) scores were used to select markers for different hosts. (C) Genes from which unitig markers were selected and assigned allele numbers for attribution. (D) Markers were tested on a subset of the data for accuracy (overall accuracy > 90%). (E) Chicken and turkey markers were combined to predict the source of 696 C. coli infection cases from poultry, cattle, and pigs.
Fig. 4. Rise in chicken-associated campylobacteriosis and multi-drug resistance in the US.
(A) Heat map of the rise in multidrug-resistant (MDR) Campylobacter across the US since 2009. Darker red colouring indicates increasing rates of MDR isolates in 2018 compared to 2009. Yellow fractions of pie charts reflect the proportion of isolates from that HHS health region that were predicted to be from chicken sources in our attribution study. Pie chart radius reflects incidence per 100,000 people in 2018 (FoodNet Fast tool: https://wwwn.cdc.gov/foodnetfast/). (B) Plot of rising campylobacteriosis incidence (left axis; black line), driven by an increase in the proportion of fluoroquinolone-resistant isolates (right axis; purple line).
