Automated Machine Learning and Explainable AI (AutoML-XAI) for Metabolomics: Improving Cancer Diagnostics

Olatomiwa O Bifarin et al.

J Am Soc Mass Spectrom. 2024 Jun 5;35(6):1089-1100. doi: 10.1021/jasms.3c00403. Epub 2024 May 1.

Abstract

Metabolomics generates complex data, necessitating advanced computational methods to derive biological insight. While machine learning (ML) is promising, the challenges of selecting the best algorithms and tuning hyperparameters, particularly for nonexperts, remain. Automated machine learning (AutoML) can streamline this process; however, the issue of interpretability could persist. This research introduces a unified pipeline that combines AutoML with explainable AI (XAI) techniques to optimize metabolomics analysis. We tested our approach on two data sets: renal cell carcinoma (RCC) urine metabolomics and ovarian cancer (OC) serum metabolomics. AutoML, using Auto-sklearn, surpassed standalone ML algorithms like SVM and k-Nearest Neighbors in differentiating between RCC and healthy controls, as well as OC patients and those with other gynecological cancers. The effectiveness of Auto-sklearn is highlighted by its AUC scores of 0.97 for RCC and 0.85 for OC, obtained from the unseen test sets. Importantly, on most of the metrics considered, Auto-sklearn demonstrated a better classification performance, leveraging a mix of algorithms and ensemble techniques. Shapley Additive Explanations (SHAP) provided a global ranking of feature importance, identifying dibutylamine and ganglioside GM3(d34:1) as the top discriminative metabolites for RCC and OC, respectively. Waterfall plots offered local explanations by illustrating the influence of each metabolite on individual predictions. Dependence plots spotlighted metabolite interactions, such as the connection between hippuric acid and one of its derivatives in RCC, and between GM3(d34:1) and GM3(18:1_16:0) in OC, hinting at potential mechanistic relationships. Through decision plots, a detailed error analysis was conducted, contrasting feature importance for correctly versus incorrectly classified samples. In essence, our pipeline emphasizes the importance of harmonizing AutoML and XAI, facilitating both simplified ML application and improved interpretability in metabolomics data science.
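The AutoML step described above can be sketched in a few lines with Auto-sklearn. The snippet below is a minimal illustration rather than the authors' exact configuration: a synthetic table stands in for the metabolomics feature matrix, and the split settings and time budgets are assumed values. Later sketches in the figure captions below reuse the objects defined here.

```python
import autosklearn.classification
import autosklearn.metrics
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score

# Stand-in for a samples x metabolites table with binary class labels
# (e.g., 1 = cancer, 0 = control); replace with the real feature matrix.
X, y = make_classification(n_samples=120, n_features=50, random_state=0)

# Hold out an unseen test set, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Autoscale (mean-center, unit variance), fitting the scaler on the training set only.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Let Auto-sklearn search over algorithms and hyperparameters and build an ensemble.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=600,   # total search budget in seconds (assumed)
    per_run_time_limit=60,         # per-pipeline time limit (assumed)
    metric=autosklearn.metrics.roc_auc,
)
automl.fit(X_train_s, y_train)

# Evaluate the final ensemble on the held-out test set.
y_score = automl.predict_proba(X_test_s)[:, 1]
print("Test ROC AUC:", roc_auc_score(y_test, y_score))
```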

Keywords: Shapley additive explanations; automated machine learning; cancer biology; explainable AI; metabolomics.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Automated ML-Explainable AI Workflow. (A) Highlight of the challenges associated with ML model selection for nonexperts. Grid and random searches are typically performed by the user to select the best hyperparameters for a model. (B) The Auto-sklearn AutoML system is based on meta-learning, Bayesian optimization, and ensemble construction. (C) Ensemble models constructed via Auto-sklearn can be interpreted with Explainable AI (XAI) techniques such as LIME and SHAP. (D) Application of AutoML and XAI to the RCC urine and OC serum metabolomics data sets. Local interpretable model-agnostic explanations, LIME; Shapley additive explanations, SHAP; renal cell carcinoma, RCC; and ovarian cancer, OC.
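For contrast with panel A, a conventional manual search might look like the sketch below. The SVM grid and cross-validation settings are purely illustrative, and the sketch reuses the training split defined after the abstract.

```python
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

# One model family, one hand-picked hyperparameter grid: the manual tuning
# loop that AutoML automates across many model families at once.
pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC(probability=True))])
param_grid = {
    "svm__C": [0.1, 1, 10, 100],
    "svm__gamma": ["scale", 0.01, 0.001],
    "svm__kernel": ["rbf", "linear"],
}
search = GridSearchCV(pipe, param_grid, scoring="roc_auc", cv=5)
search.fit(X_train, y_train)  # unscaled split; the pipeline autoscales internally
print(search.best_params_, search.best_score_)
```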
Figure 2
Machine learning pipeline. The data set was split into training and test sets, and each was subsequently autoscaled. ML models were built using the training set, and their performances were assessed using the test set. AutoML ensemble models were explained using kernel SHAP.
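A minimal kernel SHAP sketch for the AutoML ensemble, continuing from the sketch after the abstract; the background summary size and nsamples value are illustrative choices, not the paper's settings.

```python
import shap

# Explain the ensemble's predicted probability for the positive (cancer) class.
predict_pos = lambda X: automl.predict_proba(X)[:, 1]

# Summarize the training set into a small background set so that
# KernelExplainer stays tractable, then explain the test samples.
background = shap.kmeans(X_train_s, 25)
explainer = shap.KernelExplainer(predict_pos, background)
shap_values = explainer.shap_values(X_test_s, nsamples=200)
```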
Figure 3
Automated machine learning pipelines. Pipeline profile showing the pipeline primitives, pipeline matrix, and the corresponding ROC AUC scores for the (A) RCC data set and (B) OC data set. Only 20 successful ML pipelines are shown in each case. The horizontal gray bar indicates the ROC AUC values, whereas the vertical gray bar represents the correlation of the primitives with the ROC AUC scores. (C) Pipeline graph for a sample AutoML pipeline. ML pipeline performance over time during model training for the (D) RCC data set and (E) OC data set. The scores reported include the single best score on the internal training set, the single best optimization score, and the ensemble optimization score.
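The pipeline summaries and performance-over-time curves in this figure can be approximated directly from a fitted Auto-sklearn object, assuming a version of auto-sklearn that exposes the leaderboard() method and the performance_over_time_ attribute:

```python
import matplotlib.pyplot as plt

# Ranked table of the pipelines that made it into the final ensemble.
print(automl.leaderboard())

# Validation performance of the single best model and of the growing
# ensemble over the course of the search.
automl.performance_over_time_.plot(x="Timestamp", kind="line", grid=True)
plt.ylabel("ROC AUC")
plt.show()
```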
Figure 4
Machine learning interpretations of the ensemble model constructed by AutoML for the RCC data set. (A) Beeswarm plot and (B) summary plot showing global interpretation of the model. (C) Waterfall plot, local explanation for a true positive (RCC) sample. (D) Waterfall plot, local explanation for a true negative (healthy control) sample. (E) Dependence plot showing the interaction between hippuric acid and the hippurate-mannitol derivative. (F) Decision plot highlighting true positive and false negative samples.
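The SHAP views named in this caption can be produced from the kernel SHAP values computed above; the sample index, feature names, and interacting feature below are placeholders, not the metabolites reported in the paper.

```python
import numpy as np
import shap

feature_names = [f"metabolite_{i}" for i in range(X_test_s.shape[1])]
base_value = float(np.atleast_1d(explainer.expected_value)[0])

# (A, B) Global views: beeswarm and mean-|SHAP| bar summary.
shap.summary_plot(shap_values, X_test_s, feature_names=feature_names)
shap.summary_plot(shap_values, X_test_s, feature_names=feature_names, plot_type="bar")

# (C, D) Local explanation for a single test sample (waterfall).
i = 0
shap.plots.waterfall(shap.Explanation(
    values=shap_values[i],
    base_values=base_value,
    data=X_test_s[i],
    feature_names=feature_names,
))

# (E) Dependence plot for one feature, colored by a chosen interacting feature.
shap.dependence_plot(0, shap_values, X_test_s, feature_names=feature_names,
                     interaction_index=1)

# (F) Decision plot for a subset of samples.
shap.decision_plot(base_value, shap_values[:10], feature_names=feature_names)
```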
Figure 5
Machine learning interpretations of the ensemble model constructed by AutoML for the OC data set. (A) Beeswarm plot showing global interpretation of the model. (B) Waterfall plot, local explanation for a true positive (OC) sample. (C) Waterfall plot, local explanation for a true negative (non-OC) sample. (D) Dependence plot showing the interaction between GM3(d34:1) and GM3(18:1_16:1). (E) Decision plot highlighting true positive and false negative samples for OC and non-OC classification.
Figure 6
Error analysis decision plots for the RCC diagnostic model. (A) Decision plot for all true negative samples. (B) Decision plot for all false positive samples. (C) Feature importance rank correlation between true negative and false positive samples. (D) Changes in feature importance rank between true negative and false positive samples. (E) Decision plot for all true positive samples. (F) Decision plot for all false negative samples. (G) Feature importance rank correlation between true positive and false negative samples. (H) Changes in feature importance rank between true positive and false negative samples. Tau is Kendall's tau rank correlation coefficient.
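The rank-correlation step in this error analysis can be sketched as below, continuing from the earlier snippets: features are ranked by mean absolute SHAP value within each outcome group, and the two rankings are compared with Kendall's tau. The 0.5 decision threshold and the grouping into true negatives versus false positives (panels C and D) are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import kendalltau

# Predicted labels on the test set from the AutoML ensemble.
y_pred = (automl.predict_proba(X_test_s)[:, 1] >= 0.5).astype(int)

tn_mask = (y_test == 0) & (y_pred == 0)  # true negatives
fp_mask = (y_test == 0) & (y_pred == 1)  # false positives

def importance_rank(sv):
    """Rank features (0 = most important) by mean absolute SHAP value."""
    mean_abs = np.abs(sv).mean(axis=0)
    return np.argsort(np.argsort(-mean_abs))

# Compare the two rankings; both groups must be non-empty for this to be defined.
tau, p_value = kendalltau(importance_rank(shap_values[tn_mask]),
                          importance_rank(shap_values[fp_mask]))
print(f"Kendall's tau (TN vs FP feature ranks): {tau:.2f} (p = {p_value:.3f})")
```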
