mSystems. 2019 Aug 6;4(4):e00211-19.
doi: 10.1128/mSystems.00211-19.

Identification of Primary Antimicrobial Resistance Drivers in Agricultural Nontyphoidal Salmonella enterica Serovars by Using Machine Learning

Finlay Maguire et al. mSystems.

Abstract

Nontyphoidal Salmonella (NTS) is a leading global cause of bacterial foodborne morbidity and mortality. Our ability to treat severe NTS infections has been impaired by increasing antimicrobial resistance (AMR). To understand and mitigate the global health crisis that AMR represents, we need to link the observed resistance phenotypes with their underlying genomic mechanisms. Broiler chickens represent a key reservoir and vector for NTS infections, but isolates from this setting have been characterized in only very low numbers relative to clinical isolates. In this study, we sequenced and assembled 97 genomes encompassing 7 serotypes isolated from broiler chickens on farms in British Columbia between 2005 and 2008. By applying machine learning (ML) models to predict the observed AMR phenotype from these genomic data, we generated highly precise (0.92 to 0.99) logistic regression models using known AMR gene annotations as features for 7 antibiotics (amoxicillin-clavulanic acid, ampicillin, cefoxitin, ceftiofur, ceftriaxone, streptomycin, and tetracycline). Similarly, we trained "reference-free" k-mer-based set-covering machine phenotypic prediction models (0.91 to 1.0 precision) for these antibiotics. By combining the inferred k-mers and logistic regression weights, we identified the primary drivers of AMR for the 7 studied antibiotics in these isolates. With our research representing one of the largest studies of a diverse set of NTS isolates from broiler chickens, we can thus confirm that the AmpC-like CMY-2 β-lactamase is a primary driver of β-lactam resistance and that the phosphotransferases APH(6)-Id and APH(3″)-Ib are the principal drivers of streptomycin resistance in this important ecosystem.

IMPORTANCE Antimicrobial resistance (AMR) represents an existential threat to the function of modern medicine. Genomics and machine learning methods are increasingly being used to analyze and predict AMR. This type of surveillance is very important for reducing the impact of AMR. Machine learning models are typically trained using genomic data, but the aspects of the genomes that they use to make predictions are rarely analyzed. In this work, we showed how, by using different types of machine learning models and performing this analysis, it is possible to identify the key genes underlying AMR in nontyphoidal Salmonella (NTS). NTS is among the leading causes of foodborne illness globally; however, AMR in NTS has not been heavily studied within the food chain itself. Therefore, in this work we performed a broad-scale analysis of AMR in NTS isolates from commercial chicken farms and identified some priority AMR genes for surveillance.

Keywords: AMR prediction; Salmonella; antimicrobial resistance; food chain; genomics; machine learning.

Figures

FIG 1
Core genome SNP phylogeny. The figure shows the IQTree maximum likelihood phylogeny generated from core genome SNP alignment. Internal tree nodes with ≥90% ultrafast bootstrap support are noted by black circles. Correspondences of serotype clades to the lowest common ancestor of each are highlighted according to the following color scheme (as indicated by the legend): a blue background indicates S. Kentucky serovars, orange S. Hadar, green S. Heidelberg, red-orange S. I:4,[5],12:i:-, purple S. Enteritidis, brown S. Typhimurium, and light pink S. Thompson (outgroup). A randomly chosen name representing the farm from which a sample was isolated is indicated in parentheses. AST results are indicated using circles, with resistance indicated by a white circle, intermediate resistance by a gray circle, and susceptibility by a black circle. Antibiotics are abbreviated per standard shorthand from the taxon label outward as follows: amoxicillin-clavulanic acid (AMOCLA), ampicillin (AMPICI), azithromycin (AZITHR), cefoxitin (CEFOXI), ceftiofur (CEFTIF), ceftriaxone (CEFTRI), chloramphenicol (CHLORA), ciprofloxacin (CIPROF), gentamicin (GENTAM), nalidixic acid (NALAC), streptomycin (STREPT), sulfamethoxazole (SULFIZ), tetracycline (TETRA), and trimethoprim-sulfamethoxazole (TRISUL). Taxa for which the AST was systematically predicted incorrectly are indicated in white characters.
FIG 2
All AMR genes detected by serotype under CARD’s “Strict” and “Perfect” criteria (including efflux system components). Each cell indicates the percentage of isolates belonging to the relevant serotype (column) that contained each detected determinant (rows). Black blocks indicate that no isolates of that serotype had that AMR gene, and cream blocks indicate that 100% of the isolates had that AMR gene. Serotypes and genes are each ordered via hierarchical clustering as indicated by the dendrograms.
FIG 3
FDA categorization of AST prediction performance across the antibiotics with sufficient numbers of susceptible and resistant isolates for assessment. “Categorical Agreement” represents the cases in which the prediction matched the observed phenotype, “Major Disagreement” corresponds to a prediction of resistance but a determination of susceptibility by the AST, and “Very Major Disagreement” indicates a prediction of susceptibility but a determination of resistance by the AST. (A) Performance of direct tallying of the presence of AMR genes as detected by RGI. (B) The same procedure was performed but with exclusion of efflux determinants. (C) Accuracy of prediction of resistance patterns by the use of binary logistic regression models trained using the AMR genes as features. (D) Accuracy of prediction of resistance directly from the genome by the use of a k-mer-based set-covering machine model.
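The three categories defined in this caption can be tallied as in the minimal sketch below. The function name `fda_categories`, the "R"/"S" labels, and the toy data are illustrative assumptions, not the authors' code. The sketch also assumes that intermediate AST results have already been excluded or recoded.

```python
def fda_categories(predicted, observed):
    """Tally agreement categories for paired predictions and AST results.

    Labels are "R" (resistant) or "S" (susceptible). Intermediate results
    are assumed to have been removed or recoded before calling this.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    counts = {
        "categorical_agreement": 0,
        "major_disagreement": 0,
        "very_major_disagreement": 0,
    }
    for pred, obs in zip(predicted, observed):
        if pred not in ("R", "S") or obs not in ("R", "S"):
            raise ValueError("labels must be 'R' or 'S'")
        if pred == obs:
            counts["categorical_agreement"] += 1
        elif pred == "R":
            # Predicted resistant, but the AST found the isolate susceptible.
            counts["major_disagreement"] += 1
        else:
            # Predicted susceptible, but the AST found the isolate resistant.
            counts["very_major_disagreement"] += 1
    return counts


# Toy example (made-up labels, not data from the study).
predicted = ["R", "R", "S", "S", "R"]
observed = ["R", "S", "S", "R", "R"]
result = fda_categories(predicted, observed)
print(result)
```

A very major disagreement is the costlier error in practice, since it means a resistant isolate was predicted to be treatable.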
FIG 4
Observed precision for direct tallying with and without efflux pumps and test set average classifier precision for set-covering machine and logistic regression models. These results clearly show that both machine learning approaches created far more precise predictions of AST (>0.9) than direct tallying of the AMR determinants.
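As a rough illustration of how precision for direct tallying might be computed, the sketch below predicts resistance whenever any detected gene is annotated against the antibiotic, then computes precision as TP / (TP + FP). Treating resistance as the positive class is an assumption here. The gene-to-antibiotic mapping, the placeholder `efflux_pump_X`, and the isolates are invented for illustration and are not taken from CARD or from the study.

```python
def tally_predict(isolate_genes, gene_to_drugs, drug):
    """Predict "R" if any detected gene is annotated against `drug`, else "S"."""
    hit = any(drug in gene_to_drugs.get(gene, ()) for gene in isolate_genes)
    return "R" if hit else "S"


def precision(predicted, observed, positive="R"):
    """Precision = TP / (TP + FP); returns NaN if nothing was predicted positive."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p == positive and o == positive)
    fp = sum(1 for p, o in pairs if p == positive and o != positive)
    return tp / (tp + fp) if (tp + fp) else float("nan")


# Hypothetical annotations and isolates (toy data only).
gene_to_drugs = {
    "CMY-2": {"ampicillin"},
    "APH(6)-Id": {"streptomycin"},
    "efflux_pump_X": {"ampicillin", "streptomycin"},  # placeholder efflux gene
}
isolates = [{"CMY-2"}, {"efflux_pump_X"}, {"APH(6)-Id"}, set()]
observed_ampicillin = ["R", "S", "S", "S"]

# Tallying with every annotated gene, efflux included.
with_efflux = [tally_predict(g, gene_to_drugs, "ampicillin") for g in isolates]

# Tallying again after dropping the efflux annotation.
no_efflux_map = {g: d for g, d in gene_to_drugs.items() if g != "efflux_pump_X"}
without_efflux = [tally_predict(g, no_efflux_map, "ampicillin") for g in isolates]

print(precision(with_efflux, observed_ampicillin))
print(precision(without_efflux, observed_ampicillin))
```

In this toy data the efflux annotation produces one false positive, so removing it raises precision. The example only shows the mechanics of the comparison and says nothing about the magnitudes reported in the figure.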
FIG 5
A plot of the most important features and their identity for the machine learning models. (A) Learnt coefficients/weights on the AMR gene presence/absence matrix by the logistic regression models. Only weights greater than 1 are displayed on this plot. (B) Top genomic origins of the 31-mers learnt by the set-covering machines. Non-Gene Loci are the 31-mers mapping outside any of the identified genes.
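For readers unfamiliar with k-mer features, the sketch below shows one way a genome could be reduced to 31-mer presence/absence values of the kind a rule-based model such as a set-covering machine might consume. The helper names `kmers` and `presence_matrix` are hypothetical. The sketch ignores reverse complements and ambiguous bases, and it is not the pipeline used in the study.

```python
def kmers(sequence, k=31):
    """Return the set of all length-k substrings of a DNA sequence."""
    seq = sequence.upper()
    # range() is empty when the sequence is shorter than k, giving an empty set.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_matrix(genomes, features, k=31):
    """Build one row per genome with 1/0 for presence/absence of each feature k-mer."""
    rows = []
    for genome in genomes:
        found = kmers(genome, k)
        rows.append([1 if feature in found else 0 for feature in features])
    return rows


# Toy example with k=4 so the output stays readable (sequences are made up).
genomes = ["ACGTACGT", "TTTTACGA"]
features = ["ACGT", "TACG", "GGGG"]
matrix = presence_matrix(genomes, features, k=4)
print(matrix)
```

Each column of such a matrix corresponds to one k-mer. Mapping a model's selected k-mers back to annotated genes, or to loci outside any gene, is what allows them to be grouped by genomic origin as in panel B.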
