ACS Med Chem Lett. 2024 Jul 18;15(8):1386-1395.
doi: 10.1021/acsmedchemlett.4c00323. eCollection 2024 Aug 8.

Innovative Multistage ML-QSAR Models for Malaria: From Data to Discovery

Joyce V B Borba et al. ACS Med Chem Lett.

Abstract

Malaria presents a significant challenge to global public health, with around 247 million cases estimated to occur annually worldwide. The growing resistance of Plasmodium parasites to existing therapies underscores the urgent need for new and innovative antimalarial drugs. This study leveraged artificial intelligence (AI) to tackle this complex challenge. We developed multistage Machine Learning Quantitative Structure-Activity Relationship (ML-QSAR) models to effectively analyze large datasets and predict the efficacy of chemical compounds against multiple life cycle stages of Plasmodium parasites. We then selected 16 compounds for experimental evaluation, six of which showed at least dual-stage inhibitory activity and one inhibited all life cycle stages tested. Moreover, explainable AI (XAI) analysis provided insights into critical molecular features influencing model predictions, thereby enhancing our understanding of compound interactions. This study not only empowers the development of advanced predictive AI models but also accelerates the identification and optimization of potential antiplasmodial compounds.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Data scaffold analysis using Murcko scaffolds and ECFP4 descriptors represented in a 2048-bit array. t-SNE was applied to reduce the descriptor dimensions to two. Panels (A, D, H, I, J) show scaffold group distributions within each dataset (ABS-3D7 in A, ABS-W2 in D, gametocytes in H, ookinetes in I, and liver schizonts in J), including counts of unique scaffolds associated with single molecules. Black bars indicate the number of compounds, dark gray bars depict total scaffold occurrences, and light gray bars represent unique scaffold counts. Panels (B, C, E, F, G) display t-SNE projections of the chemical space for ABS-3D7 (B), ABS-W2 (C), gametocytes (E), ookinetes (F), and liver schizonts (G) datasets. Circle sizes correspond to scaffold frequencies in each dataset, with larger circles indicating higher occurrence. Circle colors denote the mean activity of molecules associated with each scaffold, transitioning from red (0: all compounds with that scaffold are inactive) to blue (1: all compounds with that scaffold are active).
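The circle size and color encoding described in this caption amounts to a per-scaffold aggregation. The following is a minimal sketch of that aggregation only, assuming each compound has already been assigned a Murcko scaffold (as a SMILES string) and a binary activity label. The function name and the toy records are hypothetical and are not taken from the paper's datasets.

```python
from collections import defaultdict


def summarize_scaffolds(records):
    """Aggregate (scaffold, label) pairs, where label is 1 (active) or 0 (inactive).

    Returns {scaffold: (frequency, mean_activity)}. Frequency would map to
    circle size and mean_activity (0.0 to 1.0) to circle color.
    """
    groups = defaultdict(list)
    for scaffold, label in records:
        groups[scaffold].append(label)
    return {s: (len(labels), sum(labels) / len(labels)) for s, labels in groups.items()}


# Hypothetical toy records: (scaffold SMILES, activity label)
records = [
    ("c1ccccc1", 1),
    ("c1ccccc1", 0),
    ("c1ccncc1", 1),
    ("c1ccncc1", 1),
    ("C1CCCCC1", 0),
]

summary = summarize_scaffolds(records)

# Scaffolds represented by a single molecule (the "unique scaffold" count).
singletons = sum(1 for freq, _ in summary.values() if freq == 1)
```

A mean activity of 0.5 for a scaffold, as for the first toy scaffold here, would sit midway on the red-to-blue scale.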
Figure 2
Statistical metrics for 5-fold cross-validation (solid bars: mean and standard deviation) and external set validation (hatched bars) of the top-performing ML-QSAR models developed for ABS-3D7 (dark pink), ABS-W2 (light pink), gametocytes (orange), ookinetes (blue), and liver schizonts (yellow) datasets. BACC: balanced accuracy; F1 score: harmonic mean of precision and recall; MCC: Matthews correlation coefficient.
Figure 3
Local SHAP interpretation of the models for external set compounds that are structurally similar but diverge in their predicted or experimentally observed outcomes. Compounds inside green squares were correctly predicted, whereas those inside red squares were incorrectly predicted. Red-highlighted SHAP fragments signify a beneficial effect on the model’s prediction, while blue-highlighted fragments signify a detrimental influence on the model’s prediction. exp. = experimental assignment of compound; pred = predicted assignment of compound; + = active; - = inactive.
Figure 4
Virtual screening workflow based on the five ML-QSAR models for the identification of active compounds against multiple stages of Plasmodium parasites.
Figure 5
A) Heat maps illustrating the predicted probability of activity (upper heat map) and experimental biological activity profiles (lower heat map) of the selected 16 virtual hits. The probability of activity, predicted by ML-QSAR models, ranges from 0% (light green) to 100% (dark blue). Phenotypic experimental screening was conducted using single-point concentrations against various stages of Plasmodium spp., including asexual blood-stage 3D7 and Dd2 strains, gametocytes, ookinetes, and liver schizont stages. In these heat maps, dark blue indicates 100% inhibition, while light green represents 0% inhibition. B) The most promising experimental hits with their predicted and experimental profiles. These hits show experimental activity equal to or greater than 50% in at least two Plasmodium life stages.
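The selection rule stated for panel B (inhibition of at least 50% in at least two life stages) can be written as a small filter. This is a sketch under assumptions: the compound names and inhibition values are hypothetical, and each dictionary key is taken to be one life stage, so results for several blood-stage strains would need to be merged into one entry beforehand.

```python
def is_multistage_hit(inhibition, threshold=50.0, min_stages=2):
    """Return True if percent inhibition meets the threshold in enough stages.

    inhibition: {life_stage: percent_inhibition}
    """
    active_stages = [stage for stage, pct in inhibition.items() if pct >= threshold]
    return len(active_stages) >= min_stages


# Hypothetical single-point inhibition profiles (percent), for illustration only.
profiles = {
    "cmpd_A": {"asexual blood": 82.0, "gametocytes": 12.0, "ookinetes": 64.0, "liver schizonts": 5.0},
    "cmpd_B": {"asexual blood": 91.0, "gametocytes": 20.0, "ookinetes": 33.0, "liver schizonts": 41.0},
    "cmpd_C": {"asexual blood": 50.0, "gametocytes": 55.0, "ookinetes": 70.0, "liver schizonts": 60.0},
}

hits = [name for name, profile in profiles.items() if is_multistage_hit(profile)]
```

The comparison is inclusive, matching the caption's "equal to or greater than 50%".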
Figure 6
a) Local SHAP interpretation of the ML-QSAR models for each life cycle stage of Plasmodium (1: ABS-3D7, 2: ABS-W2, 3: ookinete stage, 4: liver schizont stage) for the compound LDT-695. In the plot, red denotes a positive impact, while blue signifies a negative impact on the model prediction. b) The most important bits of compound LDT-695 contributing to each model’s predictions. Blue contour: central atom in the environment; yellow: aromatic atoms; gray: aliphatic ring atoms. c) Force plots for local SHAP contributions and highlighted fragments in LDT-695 corresponding to the most frequent bits.
