Nat Commun. 2024 Jan 15;15(1):524. doi: 10.1038/s41467-023-43095-4.

Dermatologist-like explainable AI enhances trust and confidence in diagnosing melanoma



Tirtha Chanda et al. Nat Commun. 2024.

Abstract

Artificial intelligence (AI) systems have been shown to help dermatologists diagnose melanoma more accurately; however, they lack transparency, hindering user acceptance. Explainable AI (XAI) methods can help to increase transparency, yet they often lack precise, domain-specific explanations. Moreover, the impact of XAI methods on dermatologists' decisions has not yet been evaluated. Building upon previous research, we introduce an XAI system that provides precise and domain-specific explanations alongside its differential diagnoses of melanomas and nevi. Through a three-phase study, we assess its impact on dermatologists' diagnostic accuracy, diagnostic confidence, and trust in the XAI support. Our results show strong alignment between XAI and dermatologist explanations. We also show that dermatologists' confidence in their diagnoses and their trust in the support system increase significantly with XAI compared to conventional AI. This study highlights dermatologists' willingness to adopt such XAI systems, promoting their future use in the clinic.


Conflict of interest statement

PT reports grants from Lilly, consulting fees from Silverchair, lecture honoraria from Lilly, FotoFinder and Novartis, outside of the present publication. TJB owns a company that develops mobile apps (Smart Health Heidelberg GmbH, Heidelberg, Germany), outside of the scope of the submitted work. WS received travel support for participation in congresses and/or (speaker) honoraria as well as research grants from medi GmbH Bayreuth, Abbvie, Almirall, Amgen, Bristol-Myers Squibb, Celgene, GSK, Janssen, LEO Pharma, Lilly, MSD, Novartis, Pfizer, Roche, Sanofi Genzyme, and UCB outside of the present publication. MLV received travel support for participation in congresses and/or (speaker) honoraria as well as research grants from Abbvie, Almirall, Amgen, Bristol-Myers Squibb, Celgene, Janssen, Kyowa Kirin, LEO Pharma, Lilly, MSD, Novartis, Pfizer, Roche, Sanofi Genzyme, and UCB outside of the present publication. BS is on the advisory board or has received honoraria from Immunocore, Almirall, Pfizer, Sanofi, Novartis, Roche, BMS and MSD, research funding from Novartis and Pierre Fabre Pharmaceuticals, and travel support from Novartis, Roche, Bristol-Myers Squibb and Pierre Fabre Pharma, outside the submitted work. SH is on the advisory board or has received honoraria from Novartis, Pierre Fabre, BMS and MSD outside the submitted work. KD has received honoraria from Novartis, Pierre Fabre and Roche outside the submitted work. SF reports consulting or advisory board membership: Bayer, Illumina, Roche; honoraria: Amgen, Eli Lilly, PharmaMar, Roche; research funding: AstraZeneca, Pfizer, PharmaMar, Roche; travel or accommodation expenses: Amgen, Eli Lilly, Illumina, PharmaMar, Roche.
JSU is on the advisory board or has received honoraria and travel support from Amgen, Bristol Myers Squibb, GSK, Immunocore, LeoPharma, Merck Sharp and Dohme, Novartis, Pierre Fabre, Roche, Sanofi outside the submitted work. ME has received honoraria and travel expenses from Novartis and Immunocore. SHo received travel support for participation in congresses, (speaker) honoraria and research grants from Almirall, UCB, Janssen, Novartis, LEO Pharma and Lilly outside of the present publication. SP has received travel support for participation in congresses and/or speaker honoraria from Abbvie, Lilly, MSD, Novartis, Pfizer and Sanofi outside of the present publication. SPo is on the advisory board or has received honoraria from Galenicum Derma, ISDIN, Cantabria Labs and Mesoestetic. RLB has received support from Castle Bioscience for the International Melanoma Pathology Study Group Symposium and Workshop. MG served as consultant to argenx (honoraria paid to institution) and Almirall and received honoraria for participation in advisory boards / travel support from Biotest, GSK, Janssen, Leo Pharma, Lilly, Novartis and UCB, all outside the scope of the submitted work. MVH received honoraria from MSD, BMS, Roche, Novartis, Sun Pharma, Sanofi, Almirall, Biofrontera, Galderma. The other authors declare no competing interests.

Figures

Fig. 1. Overview of the XAI and reader study.
a Schematic overview of our multimodal XAI. The AI system makes a prediction for each characteristic and then infers a melanoma diagnosis if it detects at least two melanoma characteristics. The diagnosis and corresponding explanations are then displayed to the clinician. b Schematic overview of our work. We first collected ground-truth annotations and corresponding ontology-based explanations for 3611 dermoscopic images from 14 international board-certified dermatologists and trained an explanatory AI on this dataset (top row). We then employed this classifier in a three-phase study (bottom row) involving 116 clinicians tasked with diagnosing dermoscopic images of melanomas and nevi. In phase 1 of the study, the clinicians received no AI assistance. In phase 2, they received the XAI’s predicted diagnoses but not its explanations. In phase 3, they received the predicted diagnoses along with the explanations. Figures created with BioRender.com.
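The caption describes a simple aggregation rule: the system predicts individual melanoma characteristics and diagnoses melanoma when at least two are detected. A minimal sketch of such a rule, assuming a set-of-strings interface; the characteristic names and the `diagnose` function are illustrative, not the paper's actual implementation:

```python
# Illustrative dermoscopic melanoma characteristics (not the paper's exact ontology).
MELANOMA_CHARACTERISTICS = {
    "atypical pigment network",
    "blue-white veil",
    "irregular streaks",
    "irregular dots and globules",
}

def diagnose(detected_characteristics, threshold=2):
    """Diagnose 'melanoma' when at least `threshold` melanoma
    characteristics were detected in the lesion, else 'nevus'."""
    n_detected = sum(
        1 for c in detected_characteristics if c in MELANOMA_CHARACTERISTICS
    )
    return "melanoma" if n_detected >= threshold else "nevus"
```

The detected characteristics themselves would come from the trained classifier's per-characteristic predictions; this sketch only shows the final inference step.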
Fig. 2. Example multimodal XAI explanation.
An example multimodal explanation from our XAI used in phase 3, showing a textual explanation (a) and the corresponding localised visual explanations (b). The XAI identified this lesion as a melanoma with the characteristics stated in the textual explanation. The white polygons represent the most important regions where the XAI detected the corresponding characteristics.
Fig. 3. Overview of our XAI’s performance.
a Ratio of mean Grad-CAM pixel activation value inside the lesion to that outside the lesion (P < 0.0001, two-sided Wilcoxon signed-rank test, n = 196 images). Higher values are better, as they indicate greater attention on regions within the lesion than on regions outside the lesion. Four data points for the baseline and 19 data points for the XAI have values above 300 and have been omitted to more clearly visualise the data. b We calculated the difference in output scores before and after obscuring the important pixels of the images (n = 200 images per threshold). Since we used a threshold on the Grad-CAM heatmaps, we calculated faithfulness values for each threshold ranging from 5 to 95. The stars represent the threshold used in our study and the values of faithfulness at this threshold. The transparent bands represent the 95% bootstrap confidence intervals. c Overlap in ontological explanations between clinician pairs for the same image compared to the overlap in ontological explanations between clinicians and our XAI. The whiskers are positioned close to zero and one, and the median lines are positioned close to zero, making them difficult to see. Each value is shifted by a random number between −0.02 and 0.02 on the y-axis so that the points can be seen more clearly. The between-clinician category consists of n = 5165 clinician-pairs, whereas the clinician-XAI category comprises n = 1089 images. d Region of interest (ROI) overlap between clinicians and our XAI compared to that of the baseline (P < 0.0001, two-sided paired t test, n = 1120 images). For all boxplots, the horizontal line on each box denotes the median value and the white dot denotes the mean. The upper and lower box limits denote the 1st and 3rd quartiles, respectively, and the whiskers extend from the box to 1.5 times the interquartile range. Source data are provided as a Source Data file.
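The two quantities in panels a and b can be sketched with NumPy. `activation_ratio` and `faithfulness` are hypothetical helper names; the study applies these ideas to Grad-CAM heatmaps from its trained classifier, whereas here `model` is any callable mapping an image to a score:

```python
import numpy as np

def activation_ratio(heatmap, lesion_mask):
    """Ratio of the mean Grad-CAM activation inside the lesion to the
    mean activation outside it; higher values indicate that attention
    is concentrated on the lesion."""
    return float(heatmap[lesion_mask].mean() / heatmap[~lesion_mask].mean())

def faithfulness(model, image, heatmap, threshold):
    """Drop in the model's output score after zeroing the pixels whose
    Grad-CAM activation lies at or above the given percentile threshold.
    A larger drop means the highlighted pixels mattered more."""
    cutoff = np.percentile(heatmap, threshold)
    occluded = image.copy()
    occluded[heatmap >= cutoff] = 0.0  # obscure the "important" pixels
    return model(image) - model(occluded)
```

The paper sweeps the percentile threshold from 5 to 95; the sketch above computes one point of that curve.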
Fig. 4. Impact of our XAI on clinicians’ diagnostic accuracy, confidence, and trust.
a Distributions of clinicians’ balanced accuracy in each phase of our study (P < 0.0001, two-sided paired t test, n = 109 participants, No AI vs. AI Support; P = 0.34, two-sided paired t test, n = 116 participants, AI Support vs. XAI Support). b Balanced diagnostic accuracy with AI and XAI support, grouped by level of experience with dermoscopy (n = 116 participants). Distributions of clinicians’ mean diagnostic confidence (c) and mean trust in the support system (d) in each phase of our study (n = 116 participants each). In a, c, and d, the grey lines connect the same participant between phases, and the black lines connecting the boxes indicate the means across all participants. For all figures, the horizontal line on each box denotes the median value and the white dot denotes the mean. The upper and lower box limits denote the 1st and 3rd quartiles, respectively, and the whiskers extend from the box to 1.5 times the interquartile range. Source data are provided as a Source Data file.
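Balanced accuracy, the metric in panels a and b, is the mean of sensitivity and specificity, which makes it robust to class imbalance between melanomas and nevi. A plain-Python sketch for the binary task (the function name and label strings are illustrative):

```python
def balanced_accuracy(y_true, y_pred, positive="melanoma"):
    """Mean of sensitivity (recall on the positive class) and
    specificity (recall on the negative class) for a binary task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

With an equal split of melanomas and nevi this coincides with plain accuracy; it diverges whenever one class dominates the test set.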
Fig. 5. Relationship between clinicians’ trust in AI and overlap in ontological explanations.
a–c Correlation between overlap in reasoning (measured by the Sørensen-Dice similarity coefficient) and trust in XAI for cases where the clinicians’ diagnoses matched those of the XAI (P = 0.01, Spearman’s rank correlation, n = 871 images). The left column depicts the relationship between overlap in reasoning and trust in XAI for both classes (a), the middle column depicts cases where both the clinicians and the XAI diagnosed melanoma (P < 0.0001, Spearman’s rank correlation, n = 567 images) (b), and the right column represents cases where they both diagnosed nevus (P = 0.01, Spearman’s rank correlation, n = 505 images) (c). Trust is measured on a Likert scale (1–10, with 1 meaning no trust and 10 meaning complete trust in the AI). Each data point is shifted by a random number between −0.02 and 0.02 on the y-axis and −0.1 and 0.1 on the x-axis so that the points can be seen more clearly. The light-coloured triangles connected by lines represent the means (calculated on non-shifted values) of each trust value and the transparent bands represent the 95% bootstrap confidence intervals. Source data are provided as a Source Data file.
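The Sørensen-Dice similarity used to score overlap in ontological explanations can be sketched as a comparison of two sets of ontology terms. The function name and the convention for two empty sets are assumptions, not taken from the paper:

```python
def dice_similarity(terms_a, terms_b):
    """Sørensen-Dice similarity between two sets of ontology terms,
    e.g. the dermoscopic characteristics cited by a clinician and by
    the XAI for the same image: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 1.0  # assumption: two empty explanations count as full agreement
    return 2 * len(a & b) / (len(a) + len(b))
```

The same coefficient applied to binary pixel masks instead of term sets gives the region-of-interest overlap reported in Fig. 3d.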
