Machine learning with random subspace ensembles identifies antimicrobial resistance determinants from pan-genomes of three pathogens
- PMID: 32119670
- PMCID: PMC7067475
- DOI: 10.1371/journal.pcbi.1007608
Abstract
The evolution of antimicrobial resistance (AMR) poses a persistent threat to global public health. Sequencing efforts have already yielded genome sequences for thousands of resistant microbial isolates, and robust computational tools are needed to systematically elucidate the genetic basis of AMR. Here, we present a generalizable machine learning workflow for identifying genetic features driving AMR, based on constructing reference strain-agnostic pan-genomes and training random subspace ensembles (RSEs). This workflow was applied to the resistance profiles of 14 antimicrobials across three urgent threat pathogens, encompassing 288 Staphylococcus aureus, 456 Pseudomonas aeruginosa, and 1588 Escherichia coli genomes. We find that feature selection by RSE detects known AMR associations more reliably than common statistical tests and previous ensemble approaches, identifying a total of 45 known AMR-conferring genes and alleles across the three organisms, as well as 25 candidate associations backed by domain-level annotations. Furthermore, we find that results from the RSE approach are consistent with the existing understanding of fluoroquinolone (FQ) resistance due to mutations in the main drug targets, gyrA and parC, in all three organisms, and suggest that the mutational landscape of those genes with respect to FQ resistance is simple. As larger datasets become available, we expect this approach to more reliably predict AMR determinants for a wider range of microbial pathogens.
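As an illustration of the general idea behind a random subspace ensemble, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the base learner, subspace size, or scoring rule, so everything here is an assumption: a small logistic regression stands in for the base learner, each learner sees a random subset of pan-genome features, and features are ranked by their mean weight across the learners that included them. The function names (`fit_logistic`, `random_subspace_ensemble`, `mean_weights`) and the synthetic presence/absence matrix are hypothetical and exist only for this example.

```python
import math
import random


def fit_logistic(X, y, features, epochs=50, lr=0.5):
    """Batch gradient-descent logistic regression restricted to `features`.

    Assumption: a stand-in base learner; the abstract does not name one.
    """
    w = {j: 0.0 for j in features}
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = {j: 0.0 for j in features}
        grad_b = 0.0
        for row, label in zip(X, y):
            z = b + sum(w[j] * row[j] for j in features)
            err = 1.0 / (1.0 + math.exp(-z)) - label  # predicted minus observed
            grad_b += err
            for j in features:
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n
        for j in features:
            w[j] -= lr * grad_w[j] / n
    return w


def random_subspace_ensemble(X, y, n_learners=100, subspace_size=3, seed=0):
    """Train each learner on a random subset of features (the 'subspace')."""
    rng = random.Random(seed)
    n_features = len(X[0])
    return [
        fit_logistic(X, y, rng.sample(range(n_features), subspace_size))
        for _ in range(n_learners)
    ]


def mean_weights(learners, n_features):
    """Average each feature's weight over the learners that included it."""
    totals = [0.0] * n_features
    counts = [0] * n_features
    for w in learners:
        for j, value in w.items():
            totals[j] += value
            counts[j] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Synthetic toy data (not from the paper): 32 "genomes" x 5 gene features,
# encoded +1 (present) / -1 (absent). Feature 0 alone determines resistance.
X = [[1 if (i >> j) & 1 else -1 for j in range(5)] for i in range(32)]
y = [1 if row[0] == 1 else 0 for row in X]

learners = random_subspace_ensemble(X, y)
scores = mean_weights(learners, n_features=5)
ranking = sorted(range(5), key=lambda j: abs(scores[j]), reverse=True)
print("top-ranked feature:", ranking[0])
```

In this toy setting the feature that determines the label should rank first, because the other features are balanced against it by construction. On real pan-genome data the features are correlated and far more numerous than the genomes, which is the situation in which averaging over many random subspaces is typically motivated.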
Conflict of interest statement
The authors have declared that no competing interests exist.