J Comput Aided Mol Des. 2023 Jun;37(5-6):265-278.
doi: 10.1007/s10822-023-00505-5. Epub 2023 Apr 22.

TargIDe: a machine-learning workflow for target identification of molecules with antibiofilm activity against Pseudomonas aeruginosa

João Carneiro et al. J Comput Aided Mol Des. 2023 Jun.

Abstract

Bacterial biofilms are a source of infectious human diseases and are heavily linked to antibiotic resistance. Pseudomonas aeruginosa is a widespread multidrug-resistant bacterium implicated in several hospital-acquired infections. In recent years, developing new drugs that inhibit Pseudomonas aeruginosa by interfering with its ability to form biofilms has become a promising strategy in drug discovery. Identifying molecules that interfere with biofilm formation is difficult, and developing them further by rationally improving their activity is particularly challenging, because it requires knowledge of the specific protein target that is inhibited. This work describes the development of a multitechnique machine-learning consensus workflow to predict the protein targets of molecules with confirmed inhibitory activity against biofilm formation by Pseudomonas aeruginosa. It uses a specialized database containing all the known targets implicated in biofilm formation by Pseudomonas aeruginosa. The experimentally confirmed inhibitors available on ChEMBL, together with chemical descriptors, were used as the input features for a combination of nine different classification models, yielding a consensus method to predict the most likely target of a ligand. The implemented algorithm is freely available at https://github.com/BioSIM-Research-Group/TargIDe under the GNU General Public Licence (GPL) version 3 and can easily be improved as more data become available.

Keywords: Biofilms; Ligand targets; Machine learning; Pseudomonas aeruginosa.
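
The abstract does not say how the nine classifiers are combined into a consensus. As a minimal sketch, assuming a plain majority vote, the idea could look like the following. The function name, the model names, the target labels, and the alphabetical tie-break are all hypothetical and are not taken from the TargIDe repository.

```python
from collections import Counter


def consensus_target(predictions):
    """Return (target, support) for the label predicted by most models.

    `predictions` maps a model name to the target label that model
    predicted for one ligand. `support` is the fraction of models that
    agree with the winning label. Ties are broken alphabetically so the
    result is deterministic; this rule is an assumption of the sketch.
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")
    counts = Counter(predictions.values())
    top = max(counts.values())
    winner = min(label for label, n in counts.items() if n == top)
    return winner, top / len(predictions)


# Hypothetical votes from nine classifiers for a single ligand.
votes = {
    "model_1": "target_A",
    "model_2": "target_A",
    "model_3": "target_B",
    "model_4": "target_A",
    "model_5": "target_C",
    "model_6": "target_A",
    "model_7": "target_B",
    "model_8": "target_A",
    "model_9": "target_B",
}

target, support = consensus_target(votes)
print(f"consensus target: {target} (support {support:.2f})")
```

Reporting the support fraction alongside the label makes it easier to see when the models disagree strongly, which may indicate a less reliable prediction.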

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Molecular representation of the protein targets included in this study and the number of ligands used per target
Fig. 2
The 10 most relevant features of the curated training database, identified using RF and selected after recursive feature elimination with cross-validation
Fig. 3
Bar graphs evaluating the predictions of several classification models on the test (10%) and training (90%) datasets using the 10 most relevant features. The cross-validation values are also shown. Bar values for train/test/cross-validation represent, from left to right, recall, precision, F1 score, AUC, and classification accuracy (CA).
Fig. 4
Applicability domain of the training and test datasets, calculated by the PCA bounding box method. The first principal component (PCA1) is the direction in which the data varies the most. The second principal component (PCA2) is orthogonal to the first and represents the direction of maximum variance that is not captured by the first principal component
Fig. 5
One-vs-rest ROC curves for the 5 target (gene) classes with the highest number of samples, for the best-performing model (gradient boosting) on the SigMol training dataset. The graph shows the mean of the true positive rate (TP rate) and the false positive rate (FP rate)
Fig. 6
Workflow for implementing the machine-learning models in the Orange software
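
The PCA bounding box named in the Fig. 4 caption can be illustrated with a short sketch. It assumes the box is the minimum-to-maximum range of the training scores on the first two principal components, and that a compound is inside the applicability domain when its projected scores fall within that range. The power-iteration PCA and all function names here are illustrative choices for a dependency-free example, not the authors' Orange workflow.

```python
import math


def _top_eigenvector(matrix, iterations=500):
    """Dominant eigenpair of a symmetric matrix by power iteration."""
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]  # arbitrary non-constant start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left in this direction
            break
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigenvalue


def project(row, means, components):
    """Scores of one sample on each principal component."""
    return [
        sum((x - m) * c for x, m, c in zip(row, means, comp))
        for comp in components
    ]


def fit_pca_box(train, n_components=2):
    """Fit PCA on the training set and record the score range per component."""
    n, d = len(train), len(train[0])
    means = [sum(row[j] for row in train) / n for j in range(d)]
    centered = [[x - m for x, m in zip(row, means)] for row in train]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    components = []
    for _ in range(n_components):
        v, lam = _top_eigenvector(cov)
        components.append(v)
        # Deflate: remove the variance already explained by this component.
        cov = [
            [cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)
        ]
    scores = [project(row, means, components) for row in train]
    lows = [min(s[k] for s in scores) for k in range(n_components)]
    highs = [max(s[k] for s in scores) for k in range(n_components)]
    return means, components, lows, highs


def in_domain(row, model):
    """True if the sample's scores lie inside the training bounding box."""
    means, components, lows, highs = model
    scores = project(row, means, components)
    return all(lo <= s <= hi for s, lo, hi in zip(scores, lows, highs))


# Hypothetical two-descriptor training set (values are made up).
train = [[1.0, 2.0], [2.0, 3.5], [3.0, 3.0], [4.0, 5.5], [5.0, 5.0], [6.0, 7.5]]
model = fit_pca_box(train)
print(in_domain([3.5, 4.0], model), in_domain([100.0, -100.0], model))
```

A bounding box is simple to compute and explain, but it can include empty regions of descriptor space, so a prediction inside the box is not guaranteed to be close to any training compound.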
