Explainable AI identifies diagnostic cells of genetic AML subtypes

Matthias Hehr^{1,2,3}, Ario Sadafi^{1,2,4}, Christian Matek^{1,2,3}, Peter Lienemann^{1,3}, Christian Pohlkamp⁵, Torsten Haferlach⁵, Karsten Spiekermann^{3,6,7}, Carsten Marr^{1,2}

Affiliations

¹ Institute of AI for Health, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany.
² Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany.
³ Laboratory of Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany.
⁴ Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany.
⁵ Munich Leukemia Laboratory, Munich, Germany.
⁶ German Cancer Consortium (DKTK), Heidelberg, Germany.
⁷ German Cancer Research Center (DKFZ), Heidelberg, Germany.

PMID: 36921004
PMCID: PMC10016704
DOI: 10.1371/journal.pdig.0000187

Matthias Hehr et al. PLOS Digit Health. 2023 Mar 15;2(3):e0000187. doi: 10.1371/journal.pdig.0000187. eCollection 2023 Mar.

Abstract

Explainable AI is deemed essential for clinical applications as it allows rationalizing model predictions, helping to build trust between clinicians and automated decision support tools. We developed an inherently explainable AI model for the classification of acute myeloid leukemia subtypes from blood smears and found that high-attention cells identified by the model coincide with those labeled as diagnostically relevant by human experts. Based on over 80,000 single white blood cell images from digitized blood smears of 129 patients diagnosed with one of four WHO-defined genetic AML subtypes and 60 healthy controls, we trained SCEMILA, a single-cell based explainable multiple instance learning algorithm. SCEMILA could perfectly discriminate between AML patients and healthy controls and detected the APL subtype with an F1 score of 0.86±0.05 (mean±s.d., 5-fold cross-validation). Analyzing a novel multi-attention module, we confirmed that our algorithm focused with high concordance on the same AML-specific cells as human experts do. Applied to classify single cells, it is able to highlight subtype specific cells and deconvolve the composition of a patient's blood smear without the need of single-cell annotation of the training data. Our large AML genetic subtype dataset is publicly available, and an interactive online tool facilitates the exploration of data and predictions. SCEMILA enables a comparison of algorithmic and expert decision criteria and can present a detailed analysis of individual patient data, paving the way to deploy AI in the routine diagnostics for identifying hematopoietic neoplasms.

Copyright: © 2023 Hehr et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

I have read the journal’s policy and the authors of this manuscript have the following competing interests: TH declares part ownership of Munich Leukemia Laboratory (MLL). CP declares employment at MLL.

Figures

**Fig 1. Single-cell based multiple instance learning predicts AML genetic subtypes with high accuracy and identifies clinically relevant cells.**
(a) Our cohort consists of 81214 single-cell images from 189 individuals comprising four genetic AML subtypes (APL with *PML*::*RARA* fusion (n = 24), AML with *NPM1* mutation (n = 36), AML with *CBFB*::*MYH11* fusion (n = 37), AML with *RUNX1*::*RUNX1T1* fusion (n = 32)) and 60 healthy controls. Single white blood cells were selected and digitized from peripheral blood smears (see Methods). The single-cell images of an exemplary *PML*::*RARA* patient (bottom) contain a *PML*::*RARA* specific faggot cell (right) with clearly visible bundles of Auer rods (arrow). (b) Our single-cell based explainable multiple instance learning algorithm (SCEMILA) classifies patients with an arbitrary number of single-cell images and sorts them by attention. From N single-cell images, features are extracted with a pre-trained convolutional neural network f_feat. A bag feature vector is generated via matrix multiplication, which allows patient classification. Moreover, single-cell images can be ordered according to attention, i.e. their importance for a class probability.
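
The pooling step described in panel (b) can be illustrated with a minimal NumPy sketch. It shows generic single-head attention pooling over a bag of cell features, not the paper's multi-attention module. The parameter names (`V`, `w`, `W_cls`, `b_cls`) and the tanh scoring function are assumptions made for illustration, and the feature matrix stands in for the output of f_feat.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(features, V, w):
    """Pool N cell feature vectors into one bag vector.

    features: (N, D) array, one row per single-cell image (assumed f_feat output)
    V: (D, H) and w: (H,) are hypothetical attention parameters
    """
    scores = np.tanh(features @ V) @ w   # one raw score per cell, shape (N,)
    attention = softmax(scores)          # weights sum to 1 over the bag
    bag = attention @ features           # matrix product gives the (D,) bag vector
    return bag, attention

def classify_bag(features, V, w, W_cls, b_cls):
    """Return class probabilities, attention weights, and cells sorted by attention."""
    bag, attention = attention_pool(features, V, w)
    probs = softmax(bag @ W_cls + b_cls)   # hypothetical linear classifier head
    order = np.argsort(attention)[::-1]    # highest-attention cell first
    return probs, attention, order
```

Because the attention weights are normalized over the bag, the same function works for any number of cells N, which is what allows patients with different numbers of single-cell images to be classified.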

**Fig 2. SCEMILA’s single-cell attention coincides with experts’ diagnostic relevance.**
(a) SCEMILA perfectly differentiated the 129 AML patients from 60 stem cell donors. The cytogenetically different AML genetic subtypes can be distinguished with F1 scores of 0.86±0.05 (*PML*::*RARA*, mean±s.d. from 5 cross validation runs), 0.75±0.06 (*NPM1*), 0.69±0.09 (*CBFB*::*MYH11*), and 0.75±0.15 (*RUNX1*::*RUNX1T1*). Additionally, the ROC-AUC scores for both the specificity/sensitivity (left value) and the precision/recall characteristic (right value) are presented. (b) An expert cytologist annotated one patient per AML entity, assigning diagnostic relevance (Fig 2C) and the morphological cell type (Fig 2E, 2F). (c) In the top attention quartile, 89% of the cells have been annotated diagnostically relevant for AML by an expert hematologist. In contrast, 77% of cells in the low attention quartile have been deemed healthy or AML unrelated. Cells with medium attention ‘indicate AML’ or ‘can indicate AML’, according to the expert. Every data point represents a single-cell image from an exemplary, correctly classified *PML*::*RARA* patient. Ticks and pie charts at the top show quartile ranges and diagnostic relevance distribution within quartiles. (d) Individual single-cell images from the *PML*::*RARA* patient show a diverse morphology. The cell with the highest attention (rightmost image) shows strong cytoplasmic granularity and a bilobed, large nucleus. Cells with low attention (leftmost image) show no malignant features. (e) According to expert annotation, atypical promyelocytes (orange) and myeloblasts (red) receive highest attention, while SCEMILA pays little attention to physiological cell types such as neutrophil granulocytes (green), lymphocytes (blue) or other (gray).
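
The quartile comparison in panel (c) can be sketched as follows. This is an illustrative reconstruction, assuming that quartiles are formed by ranking cells by attention and splitting them into four near-equal groups, and that the expert annotation is reduced to a boolean "diagnostically relevant" flag. The caption does not specify either detail, and the function name is hypothetical.

```python
import numpy as np

def relevance_by_attention_quartile(attention, relevant):
    """Fraction of expert-relevant cells in each attention quartile.

    attention: per-cell attention values for one patient (needs at least 4 cells)
    relevant:  per-cell booleans, True if the expert marked the cell as relevant
    Returns four fractions, ordered from the lowest to the highest quartile.
    """
    attention = np.asarray(attention, dtype=float)
    relevant = np.asarray(relevant, dtype=bool)
    order = np.argsort(attention)        # cell indices, ascending attention
    groups = np.array_split(order, 4)    # rank-based quartiles (assumption)
    return [float(relevant[g].mean()) for g in groups]
```

For the patient shown in the figure, the last entry of the returned list would correspond to the reported share of relevant cells in the top attention quartile.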

**Fig 3. SCEMILA deconvolves patient composition and identifies representative cells for AML subtypes.**
(a) By passing instances individually through SCEMILA, the algorithm returns single-cell predictions. (b) Bar plots show predictions for every single cell of each individual in our dataset. Cells are ordered by attention, with low-attention cells on the left and high-attention cells on the right side. Patients who were classified with strong output activation are presented further up in the list, and the corresponding bag prediction is encoded in the patient label color next to the column. (c) The top 10 cells (2 per fold, based on single-cell predictions made by SCEMILA) are displayed for every entity, and single-cell morphology matches existing medical knowledge. SCEMILA recognizes atypical promyelocytes with strong granulation (*PML*::*RARA*), cup-like blasts (*NPM1*) and myelomonocytic cells (*CBFB*::*MYH11*) as relevant for the respective entities. For our algorithm, smaller blasts are associated with *RUNX1*::*RUNX1T1*, while neutrophil granulocytes are classified as controls. (d) The UMAP embedding shows all cells from the first test set fold. Within the UMAP, the color code represents individual single-cell predictions made by our algorithm, with the intensity representing the model output for the respective class. Our approach delineates dedicated morphological cell clusters for every AML entity as well as the controls.
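
Panel (a) describes passing instances through the model one at a time. A minimal sketch of why this yields per-cell predictions is given below, assuming a single-head attention pooling in which the weights are normalized over the bag: a bag containing one cell receives attention weight 1, so the bag vector equals that cell's feature vector and the classifier head can be applied to each cell directly. The classifier parameters `W_cls` and `b_cls` and both function names are hypothetical, and the paper's multi-attention module may differ in detail.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def single_cell_predictions(features, W_cls, b_cls):
    """Class probabilities for every cell treated as its own one-instance bag.

    features: (N, D) per-cell feature vectors
    W_cls: (D, C) and b_cls: (C,) form a hypothetical linear classifier head
    """
    # With one instance per bag the attention weight is 1, so the bag
    # vector is the cell's own feature vector.
    logits = features @ W_cls + b_cls
    return softmax(logits, axis=1)       # shape (N, C), rows sum to 1

def top_cells_per_class(cell_probs, k=2):
    """Indices of the k cells with the highest probability for each class."""
    ranked = np.argsort(-cell_probs, axis=0)   # best cell first, per class column
    return ranked[:k].T                        # shape (C, k)
```

Calling `top_cells_per_class` with `k=2` on each cross-validation fold would give two cells per class and fold, in the spirit of the selection shown in panel (c).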
