Review
Clin Transl Sci. 2024 Nov;17(11):e70056.
doi: 10.1111/cts.70056.

Practical guide to SHAP analysis: Explaining supervised machine learning model predictions in drug development

Ana Victoria Ponce-Bobadilla et al. Clin Transl Sci. 2024 Nov.

Abstract

Despite increasing interest in using Artificial Intelligence (AI) and Machine Learning (ML) models for drug development, effectively interpreting their predictions remains a challenge, which limits their impact on clinical decisions. We address this issue by providing a practical guide to SHapley Additive exPlanations (SHAP), a popular feature-based interpretability method, which can be seamlessly integrated into supervised ML models to gain a deeper understanding of their predictions, thereby enhancing their transparency and trustworthiness. This tutorial focuses on the application of SHAP analysis to standard ML black-box models for regression and classification problems. We provide an overview of various visualization plots and their interpretation, describe available software for implementing SHAP, and highlight best practices as well as special considerations when dealing with binary endpoints and time-series models. To enhance the reader's understanding of the method, we also apply it to inherently explainable regression models. Finally, we discuss the limitations of the method and ongoing advancements aimed at tackling its current drawbacks.
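
As a rough illustration of the idea behind SHAP, the sketch below computes exact Shapley values for a single prediction by enumerating every feature coalition. It is not the algorithm used by any SHAP software package and is not taken from the article: the function `shapley_values`, the two-feature `toy_model` (loosely echoing the blood pressure, age, and BMI example in the figures), its coefficients, and the choice to replace "absent" features with a fixed baseline value are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by brute-force enumeration.

    Assumption: a feature that is "absent" from a coalition takes its
    baseline value. Cost grows as 2^n, so this only suits tiny examples.
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` are "present".
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: blood pressure from age and BMI with an interaction.
    age, bmi = z
    return 90.0 + 0.5 * age + 1.0 * bmi + 0.01 * age * bmi


x = [60.0, 30.0]          # hypothetical subject: age 60, BMI 30
baseline = [40.0, 25.0]   # hypothetical reference subject
phi = shapley_values(toy_model, x, baseline)
base_value = toy_model(baseline)

print("base value:", base_value)
print("contributions (age, BMI):", [round(p, 6) for p in phi])
print("base + contributions:", round(base_value + sum(phi), 6))
print("model prediction:", toy_model(x))
```

The last two printed values agree: the contributions sum to the difference between the prediction and the baseline prediction. This additivity property is what a waterfall plot visualizes for an individual subject.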

Conflict of interest statement

All authors are employees of AbbVie and may hold AbbVie stock.

Figures

FIGURE 1
Standard supervised ML workflow.
FIGURE 2
Different visualization plots of SHAP values from an XGBoost model when predicting blood pressure: (a) Bar plot; (b) Beeswarm plot; (c) A scatter plot for the feature age colored by each subject's BMI; (d) Waterfall plot for an example subject.
FIGURE 3
Scatter plot of SHAP values for feature age faceted by train‐test status when considering cross‐validation. The trends across the different folds are depicted as separate lines, one per fold.
FIGURE 4
Visualization plots of SHAP values derived from an XGBoost model for a classification problem, explaining the predicted probabilities (a,c,e) and the predicted log‐odds (b,d,f).
FIGURE 5
(a) Example time‐course of the PK model considered to model drug concentration. Different visualization plots for the SHAP values explaining the predictions of individual clearances are depicted; (b) Bar plot; (c) Beeswarm plot; (d–f) Scatter plots of SHAP values corresponding to concentration at different times.
FIGURE 6
Visualization plots for SHAP values of different ML regression models; (a,c,e) Bar plots; (b,d,f) Beeswarm plots.
