Nat Med. 2023 Aug;29(8):1941-1946. doi: 10.1038/s41591-023-02475-5. Epub 2023 Jul 27.

A reinforcement learning model for AI-based decision support in skin cancer

Catarina Barata et al. Nat Med. 2023 Aug.

Abstract

We investigated whether human preferences hold the potential to improve diagnostic artificial intelligence (AI)-based decision support using skin cancer diagnosis as a use case. We utilized nonuniform rewards and penalties based on expert-generated tables, balancing the benefits and harms of various diagnostic errors, which were applied using reinforcement learning. Compared with supervised learning, the reinforcement learning model improved the sensitivity for melanoma from 61.4% to 79.5% (95% confidence interval (CI): 73.5-85.6%) and for basal cell carcinoma from 79.4% to 87.1% (95% CI: 80.3-93.9%). AI overconfidence was also reduced while simultaneously maintaining accuracy. Reinforcement learning increased the rate of correct diagnoses made by dermatologists by 12.0% (95% CI: 8.8-15.1%) and improved the rate of optimal management decisions from 57.4% to 65.3% (95% CI: 61.7-68.9%). We further demonstrated that the reward-adjusted reinforcement learning model and a threshold-based model outperformed naïve supervised learning in various clinical scenarios. Our findings suggest the potential for incorporating human preferences into image-based diagnostic algorithms.
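The abstract describes applying nonuniform rewards and penalties from an expert-generated table via reinforcement learning. As an illustration only (not the authors' published code), the following minimal sketch shows how such a reward table could drive a REINFORCE-style policy-gradient update of a diagnosis classifier; the class order, reward values, feature dimension and linear head are all assumptions.

```python
# Minimal sketch: fine-tuning a classifier with an expert reward table via a
# REINFORCE-style policy gradient. All values below are illustrative assumptions.
import torch
import torch.nn as nn

NUM_CLASSES = 7  # MEL, BCC, AKIEC, BKL, NV, DF, VASC (order assumed)

# Hypothetical reward table R[true, predicted]: correct diagnoses are rewarded,
# and missing a melanoma (e.g. calling it a nevus) is penalized most heavily.
reward_table = torch.full((NUM_CLASSES, NUM_CLASSES), -1.0)
reward_table.fill_diagonal_(1.0)
reward_table[0, 4] = -10.0  # melanoma (row 0) predicted as nevus (col 4)

model = nn.Sequential(nn.Linear(128, NUM_CLASSES))  # stand-in for a CNN head
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def rl_step(features: torch.Tensor, labels: torch.Tensor) -> float:
    """One policy-gradient update: sample a diagnosis from the model's softmax
    policy and weight its log-probability by the expert reward."""
    logits = model(features)
    dist = torch.distributions.Categorical(logits=logits)
    actions = dist.sample()                  # sampled diagnoses
    rewards = reward_table[labels, actions]  # look up expert reward
    loss = -(rewards * dist.log_prob(actions)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random features standing in for image embeddings.
feats = torch.randn(32, 128)
labs = torch.randint(0, NUM_CLASSES, (32,))
rl_step(feats, labs)
```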

Conflict of interest statement

The authors declare the following competing interests: P.T. has received fees from Silverchair, speaker honoraria from FotoFinder, Lilly and Novartis, and an unrestricted one-year postdoc grant from MetaOptima Technology Inc. N.C. is a Microsoft employee and owns diverse investments across technology and healthcare companies. A.H. is a consultant to Canfield Scientific Inc. and advisory board member of Scibase AB. H.P.S. is a shareholder of MoleMap NZ Limited and e-derm consult GmbH and undertakes regular teledermatological reporting for both companies. H.P.S. is also a medical consultant for Canfield Scientific Inc., MoleMap Australia Pty Ltd, Blaze Bioscience Inc. and a medical adviser for First Derm. V.R. is a medical adviser for Inhabit Brands, Inc. H.K. received nonfinancial support from Derma Medical Systems, Fotofinder and Heine, and speaker fees from Fotofinder. The remaining authors declare no competing interests.

Figures

Fig. 1. Comparison of models and reader study results.
a, Expert-generated reward table used to train the RL model; rows, ground truth; columns, predictions. b,c, Confusion matrix of the SL model (b) and the RL model (c) using the same test set (n = 1,511). Rows, ground truth; columns, predictions. The proportions are normalized by the row-sums (MEL: n = 171; BCC: n = 93; AKIEC: n = 43; BKL: n = 217; NV: n = 908; DF: n = 44; VASC: n = 35). d, Boxplot of difference in entropy of paired test set predictions (n = 1,511) of the SL model and the RL model. Black line, median; boxes, 25th–75th percentiles; whiskers, minimum and maximum values, P < 0.0001 (Wilcoxon test). e,f, Results of the reader study comparing sensitivities (e) and frequencies of optimal management decisions (f) of 89 dermatologists by diagnosis without AI support (−AI), with support by the SL model (+SL) and with support by the RL model (+RL). Optimal managements: ‘excision’ for melanomas and basal cell carcinomas; ‘local therapy’ for actinic keratoses/intraepidermal carcinoma; and ‘dismiss’ for nevi, benign keratinocytic lesions, dermatofibroma and vascular lesions. Bars, means; whiskers, standard error. Sample sizes: MEL(−AI): n = 89; MEL(+SL): n = 78; MEL(+RL): n = 81; BCC(−AI): n = 89; BCC(+SL): n = 63; BCC(+RL): n = 68; AKIEC(−AI): n = 89; AKIEC(+SL): n = 60; AKIEC(+RL): n = 72; NV(−AI): n = 89; NV(+SL): n = 88; NV(+RL): n = 85; BKL(−AI): n = 89; BKL(+SL): n = 65; BKL(+RL): n = 76; DF(−AI): n = 89; DF(+SL): n = 71; DF(+RL): n = 61; VASC(−AI): n = 89; VASC(+SL): n = 67; VASC(+RL): n = 65. Abbreviations: MEL, melanoma; BCC, basal cell carcinoma; AKIEC, actinic keratosis/intraepidermal carcinoma; BKL, benign keratinocytic lesion; NV, melanocytic nevus; DF, dermatofibroma; VASC, vascular lesion.
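For readers who want to reproduce the kinds of summaries shown in panels b–d, here is a small sketch, under assumed inputs, of a row-normalized confusion matrix and the per-prediction entropy used to compare model confidence; the class order and toy arrays are illustrative assumptions.

```python
# Sketch of two quantities summarized in Fig. 1b-d: a row-normalized confusion
# matrix and the Shannon entropy of each predicted probability vector.
import numpy as np

CLASSES = ["MEL", "BCC", "AKIEC", "BKL", "NV", "DF", "VASC"]  # assumed order

def row_normalized_confusion(y_true: np.ndarray, y_pred: np.ndarray) -> np.ndarray:
    """Counts of (ground truth, prediction) pairs, normalized by row sums so each
    row shows the per-class prediction breakdown; empty rows are left at zero."""
    k = len(CLASSES)
    cm = np.zeros((k, k))
    np.add.at(cm, (y_true, y_pred), 1)
    row_sums = cm.sum(axis=1, keepdims=True)
    return cm / np.maximum(row_sums, 1)

def prediction_entropy(probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Entropy per prediction; lower values indicate more confident
    (and potentially overconfident) outputs."""
    return -(probs * np.log(probs + eps)).sum(axis=1)

# Toy usage with random data standing in for model outputs on a test set.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(len(CLASSES)), size=200)
y_true = rng.integers(0, len(CLASSES), size=200)
cm = row_normalized_confusion(y_true, probs.argmax(axis=1))
ent = prediction_entropy(probs)
```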
Fig. 2. Comparison of models in three different scenarios.
Top level (binary scenario: benign versus malignant): a, Experts’ malignancy probability thresholds for decision to excise (n = 10). Lines, median; boxes, 25th–75th percentiles; whiskers, values within 1.5 times interquartile range. b, Receiver operating characteristic curve derived from the SL model and operating points of ten experts using either thresholds (SL model) or rewards (RL model). Possible management decisions were ‘dismiss’ or ‘excise’. True and false positive rates refer to proportions of malignant and benign lesions that were excised. Black triangle, naïve approach (excision if malignant probability > 0.5). c, Boxplot comparing TPRs for melanomas applying thresholds (SL model) and rewards (RL model) provided by ten experts. Bars, means; whiskers, standard deviations (P = 0.11, paired t-test); dashed line, naïve approach. Middle level (multiclass scenario, additional therapeutic option): d, Thresholds of ten experts for probabilities of actinic keratosis/intraepidermal carcinoma for decision to treat locally. Line, median; boxes, 25th–75th percentiles; whiskers, values within 1.5 times interquartile range. e, Median rewards per action and diagnosis. f–h, Confusion matrices of actions by diagnosis: naïve approach (f), threshold-adjusted SL model (g) and RL model (h). Lower level (patient-centered approach, 7,375 lesions, 524 patients): i, Thresholds of ten experts for malignancy probabilities for decision to dismiss, monitor or excise. j, Median rewards per action and diagnosis. k, Number of excisions of benign lesions by patient according to model. l, Number of monitored benign lesions by patient according to model. m, Management strategies for 55 melanomas according to model.
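The binary scenario in panels a–c contrasts threshold-based use of the SL model with reward-based action selection. Below is a hedged sketch of the two decision rules; the 0.5 threshold mirrors the naïve approach described in the caption, while the reward values are illustrative assumptions rather than the experts' table.

```python
# Sketch of the two binary decision rules (dismiss vs excise): a probability
# threshold versus maximizing expected reward. Reward values are assumed.
import numpy as np

def decide_by_threshold(p_malignant: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """'excise' if the predicted malignancy probability exceeds the threshold
    (threshold = 0.5 corresponds to the naive approach), else 'dismiss'."""
    return np.where(p_malignant > threshold, "excise", "dismiss")

def decide_by_expected_reward(p_malignant: np.ndarray) -> np.ndarray:
    """Pick the action maximizing expected reward under an illustrative reward
    table: rows = (benign, malignant), columns = (dismiss, excise)."""
    rewards = np.array([[ 1.0, -1.0],   # benign: dismissing is rewarded
                        [-8.0,  4.0]])  # malignant: missing it is heavily penalized
    p = np.stack([1.0 - p_malignant, p_malignant], axis=1)  # (n, 2) class probabilities
    expected = p @ rewards                                   # (n, 2) per-action values
    return np.where(expected[:, 1] > expected[:, 0], "excise", "dismiss")

# Toy usage: with these assumed rewards the effective excision cut-off falls
# near p = 0.14, i.e. well below the naive 0.5 threshold.
p = np.array([0.05, 0.2, 0.6])
print(decide_by_threshold(p), decide_by_expected_reward(p))
```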
Extended Data Fig. 1. Comparison of baseline SL model with RL model.
a: Alluvial plot of test set (n = 1,511); the left block shows the ground truth, the middle block shows the results of supervised learning (SL), and the right block shows the results of reinforcement learning (RL) based on a reward table created by experts; only alluvials with n > 5 are shown. MEL = melanoma (n = 171), BCC = basal cell carcinoma (n = 93), AKIEC = actinic keratosis and intraepidermal carcinoma (n = 43), BKL = benign keratinocytic lesion (n = 217), NV = melanocytic nevus (n = 908), DF = dermatofibroma (n = 44), VASC = vascular lesion (n = 35). b: Boxplots of entropy of correct and incorrect predictions for melanoma (n = 171) and melanocytic nevi (n = 908) according to applied model. Black line = median, boxes = 25th–75th percentiles, whiskers = values within 1.5 times interquartile range. Abbreviations: SL = supervised learning, RL = reinforcement learning, dx = ground truth.
Extended Data Fig. 2. Scenario with 7 diagnoses and ‘local therapy’ as an additional treatment option.
a: Graphical abstract of scenario adding the treatment option ‘local therapy’ (for example, cryotherapy) for actinic keratosis/intraepidermal carcinomas. While excision is the optimal management for melanoma and most basal cell carcinomas, local therapy is optimal for actinic keratosis/intraepidermal carcinoma. We judged local therapy to be a harmful treatment for melanomas and suboptimal for basal cell carcinomas suitable for surgery (all basal cell carcinomas in the dataset). b: Proportion of cases per diagnosis and model that received optimal management (excision for melanoma and basal cell carcinoma, local therapy for actinic keratoses/intraepidermal carcinoma, and no treatment (‘dismiss’) for all benign diagnoses). c: Proportion of cases per diagnosis and model that were mismanaged. Mismanagement included all procedures except excision for melanoma and basal cell carcinoma, all procedures except excision or local therapy for actinic keratoses/intraepidermal carcinoma, and all procedures except ‘dismiss’ for all benign conditions (nevus, benign keratinocytic lesions, dermatofibroma, and vascular lesions). Abbreviations and sample size: mel = melanoma (n = 171), bcc = basal cell carcinoma (n = 93), akiec = actinic keratosis/intraepidermal carcinoma (n = 43), bkl = benign keratinocytic lesion (n = 217), nv = nevus (n = 908), df = dermatofibroma (n = 44), vasc = vascular lesion (n = 35).
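Panels b and c tally optimal management and mismanagement per diagnosis. As a hypothetical illustration of that bookkeeping (not the authors' evaluation code), the sketch below maps each diagnosis to its optimal action and counts both rates; the label strings and example inputs are assumptions.

```python
# Sketch of per-diagnosis optimal-management and mismanagement rates, following
# the definitions in the caption; excision of AKIEC is acceptable but not optimal.
from collections import Counter

OPTIMAL = {
    "mel": "excise", "bcc": "excise", "akiec": "local therapy",
    "bkl": "dismiss", "nv": "dismiss", "df": "dismiss", "vasc": "dismiss",
}
ACCEPTABLE = {dx: {act} for dx, act in OPTIMAL.items()}
ACCEPTABLE["akiec"].add("excise")  # not counted as mismanagement

def rates_by_diagnosis(diagnoses, actions):
    """Return {diagnosis: (optimal-management rate, mismanagement rate)}."""
    totals, optimal, mismanaged = Counter(), Counter(), Counter()
    for dx, act in zip(diagnoses, actions):
        totals[dx] += 1
        optimal[dx] += act == OPTIMAL[dx]
        mismanaged[dx] += act not in ACCEPTABLE[dx]
    return {dx: (optimal[dx] / n, mismanaged[dx] / n) for dx, n in totals.items()}

# Toy usage with made-up cases.
print(rates_by_diagnosis(["mel", "mel", "akiec", "nv"],
                         ["excise", "dismiss", "excise", "dismiss"]))
```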
Extended Data Fig. 3. Scenario of high-risk patients with multiple nevi.
a: Graphical abstract of scenario of monitoring of high-risk individuals with multiple nevi. Due to the large number of lesions per patient, this scenario requires a more patient-centered and less lesion-centered approach. Most melanomas detected during monitoring are noninvasive, slow-growing lesions. Short-term monitoring of these melanomas, while not optimal, is considered acceptable. b: Malignancy probability predictions of the baseline SL model according to management predictions of the RL model for benign lesions (n = 7,320) and melanomas (n = 55). The red dashed horizontal line indicates the median value of the melanoma probability selected by ten experts as the threshold for excision. The black dashed horizontal line indicates the minimum value. Black line = median, boxes = 25th–75th percentiles, whiskers = values within 1.5 times the interquartile range.
