Sci Rep. 2024 Oct 14;14(1):24045.
doi: 10.1038/s41598-024-71169-w.

Explainable artificial intelligence (XAI) to find optimal in-silico biomarkers for cardiac drug toxicity evaluation


Muhammad Adnan Pramudito et al. Sci Rep.

Abstract

The Comprehensive In-vitro Proarrhythmia Assay (CiPA) initiative aims to refine the assessment of drug-induced torsades de pointes (TdP) risk, utilizing computational models to predict cardiac drug toxicity. Despite advancements in machine learning applications for this purpose, the specific contribution of in-silico biomarkers to toxicity risk levels has yet to be thoroughly elucidated. This study addresses this gap by implementing explainable artificial intelligence (XAI) to illuminate the impact of individual biomarkers in drug toxicity prediction. We employed the Markov chain Monte Carlo method to generate a detailed dataset for 28 drugs, from which twelve in-silico biomarkers of 12 drugs were computed to train various machine learning models, including Artificial Neural Networks (ANN), Support Vector Machines (SVM), Random Forests (RF), XGBoost, K-Nearest Neighbors (KNN), and Radial Basis Function (RBF) networks. Our study's innovation is leveraging XAI, mainly through the SHAP (SHapley Additive exPlanations) method, to dissect and quantify the contributions of biomarkers across these models. Furthermore, the model performance was evaluated using the test set from 16 drugs. We found that the ANN model coupled with the eleven most influential in-silico biomarkers, namely dVm/dt_repol, dVm/dt_max, APD90, APD50, APDtri, CaD90, CaD50, Catri, CaDiastole, qInward, and qNet, showed the highest classification performance among all classifiers, with Area Under the Curve (AUC) scores of 0.92 for predicting high-risk, 0.83 for intermediate-risk, and 0.98 for low-risk drugs. We also found that the optimal in-silico biomarkers selected based on SHAP analysis may differ between classification models. However, biomarker selection did not always improve performance; therefore, evaluating various classifiers is still essential to obtain the desired classification performance.
Our proposed method could provide a systematic way to assess the best classifier with the optimal in-silico biomarkers for predicting the TdP risk of drugs, thereby advancing the field of cardiac safety evaluations.
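The per-class AUC scores quoted in the abstract can be read as one-vs-rest comparisons: each risk class is scored against the other two. The following is a minimal sketch of that calculation using the rank (Mann-Whitney) form of the AUC. The function name, the six toy drugs, and their predicted probabilities are hypothetical and are not taken from the study.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC for `positive` versus all other classes, via the rank statistic."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs where the positive sample scores higher;
    # ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: predicted class probabilities for six drugs.
labels = ["high", "high", "intermediate", "intermediate", "low", "low"]
probs = {
    "high":         [0.8, 0.6, 0.3, 0.7, 0.1, 0.0],
    "intermediate": [0.1, 0.3, 0.5, 0.2, 0.3, 0.2],
    "low":          [0.1, 0.1, 0.2, 0.1, 0.6, 0.8],
}
aucs = {c: auc_one_vs_rest(probs[c], labels, c) for c in probs}
print(aucs)
```

An AUC of 1.0 means every drug of that class was ranked above every drug outside it, while 0.5 corresponds to random ranking.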


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Illustration of our proposed algorithm for evaluating proarrhythmic drug risk, identifying key biomarkers in high, intermediate, and low-risk groups. The figure outlines the process, including preprocessing in-vitro data, sample generation, feature variability simulation, and identifying significant biomarkers using XAI algorithms (ANN). System performance is evaluated through model testing based on the XAI approach.
Fig. 2
Illustration of the in-silico biomarkers in the AP profile and Ca profile: dVm/dt_repol, dVm/dt_max, Vm_resting, APD90, APD50, APDtri, CaD90, CaD50, Catri, CaDiastole, qInward, and qNet.
Fig. 3
(a) Schematic representation of the ANN classification model, which employs twelve in-silico biomarkers as inputs. The model architecture comprises three hidden layers, each consisting of six neurons. Outputs from the ANN model are categorized into three risk classes: high-risk, intermediate-risk, and low-risk for TdP.
(b) Illustration of the XGBoost classifier model. This model utilizes twelve in-silico features to train an ensemble of 200 decision trees. The trees are built sequentially, with each tree learning from the errors (residuals) of the previous ones, thereby refining the classification. The node splitting is guided by an objective function, and the final output is the sum of predictions from all trees.
(c) Depiction of the Random Forest training process. Starting with twelve in-silico features, the method employs bootstrap sampling to create multiple training sets. Each set is used to train a decision tree, resulting in 200 trees. The classification outcome for a sample is then determined by majority voting or averaging the results from all decision trees.
(d) Workflow of the KNN classifier. The process begins with twelve in-silico features and initializes using K mean. It calculates the distance between training and testing points, sorts by distance, and applies majority voting for TdP risk classification, resulting in the final output.
(e) Process diagram for the SVM classifier. The model starts with twelve in-silico features and uses kernel mean initialization. Parameters γ and C are then established, followed by the training phase, culminating in the TdP risk classification.
(f) Diagram of the RBF network. The network starts with twelve in-silico features leading into a hidden layer that applies the RBF for transformation. The output is divided into three risk categories: high, intermediate, and low risk for TdP.
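The KNN workflow in panel (d) (compute distances, sort, majority vote) can be sketched in a few lines. This is an illustrative sketch only: the function name, the choice of k = 3, the Euclidean distance, and the two-feature toy points are assumptions, whereas the study uses twelve in-silico features and does not state these settings here.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    dists = [(math.dist(x, query), y) for x, y in zip(train_X, train_y)]
    # Sort by distance so the closest neighbours come first.
    dists.sort(key=lambda pair: pair[0])
    # Majority vote over the k closest labels.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set with two features per drug sample.
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (1.1, 0.9), (2.0, 2.0), (2.1, 1.9)]
train_y = ["low", "low", "intermediate", "intermediate", "high", "high"]

pred_a = knn_predict(train_X, train_y, (1.9, 2.0))
pred_b = knn_predict(train_X, train_y, (0.05, 0.1))
print(pred_a, pred_b)
```

In practice the features would usually be standardized first, since a distance-based vote is sensitive to the scale of each biomarker.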
Fig. 4
An evaluation algorithm was employed to assess the performance of the classification model proposed by the CiPA research group, utilizing the principles of the central limit theorem. AUC, area under the receiver operating characteristic curve; LR, likelihood ratio.
Fig. 5
(a) Feature importance visualization for the ANN model. The bar chart displays the mean SHAP values of each in-silico biomarker across three classes: high-risk, intermediate-risk, and low-risk. The feature qInward appears to be the most influential for high-risk classification, while Vm_resting has the least impact.
(b) Feature importance chart for the XGBoost model. The graph illustrates the average impact of each in-silico biomarker on the model's output, with higher mean SHAP values indicating greater importance. For high-risk classification, qInward and dVm/dt_repol show significant influence.
(c) Summary of SHAP values by class for the RF model. This bar chart represents the mean SHAP values by class, highlighting the features that most strongly affect the model's predictions, with qInward showing a high impact on the high-risk class.
(d) SHAP value summary for the SVM model. The bar chart details the feature importance, where CaD50 is notably influential across all risk categories, suggesting a critical role in the model's risk stratification process.
(e) Feature importance for the KNN model, depicted through mean SHAP values. CaD50 and APD90 stand out as key features with high importance for the high-risk and intermediate-risk classifications, respectively.
(f) Visualization of feature importance for the RBF model. The chart highlights the mean SHAP values, with dVm/dt_max playing a prominent role in distinguishing among all risk categories.
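The mean SHAP values in these panels rest on Shapley values, which average a feature's marginal contribution over all subsets of the other features. The sketch below computes exact Shapley values by brute-force enumeration for a tiny hypothetical model and then averages their absolute values over samples to get a per-feature importance. It is not the SHAP library and not the authors' pipeline: the linear `toy_model`, the zero baseline, and the sample values are assumptions chosen so the result is easy to check by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`; absent features take `baseline`."""
    n = len(x)

    def value(subset):
        # Evaluate the model with features outside `subset` set to the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical linear risk score over three generic features.
    return 2.0 * z[0] - 1.0 * z[1] + 0.0 * z[2]


samples = [[1.0, 3.0, 5.0], [2.0, 1.0, 0.0]]
baseline = [0.0, 0.0, 0.0]
per_sample = [shapley_values(toy_model, x, baseline) for x in samples]

# Mean absolute Shapley value per feature, analogous to a mean-|SHAP| bar chart.
importance = [sum(abs(p[i]) for p in per_sample) / len(per_sample) for i in range(3)]
print(per_sample, importance)
```

For a linear model each Shapley value reduces to the weight times the feature's offset from the baseline, which makes the output easy to verify. Enumeration costs grow exponentially with the number of features, which is why approximate explainers are normally used for larger models.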


