Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study
- PMID: 31201137
- PMCID: PMC8237239
- DOI: 10.1016/S1470-2045(19)30333-X
Abstract
Background: Whether machine-learning algorithms can diagnose all pigmented skin lesions as accurately as human experts is unclear. The aim of this study was to compare the diagnostic accuracy of state-of-the-art machine-learning algorithms with human readers for all clinically relevant types of benign and malignant pigmented skin lesions.
Methods: For this open, web-based, international, diagnostic study, human readers were asked to diagnose dermatoscopic images selected randomly in 30-image batches from a test set of 1511 images. The diagnoses from human readers were compared with those of 139 algorithms created by 77 machine-learning labs, who participated in the International Skin Imaging Collaboration 2018 challenge and received a training set of 10 015 images in advance. The ground truth of each lesion fell into one of seven predefined disease categories: intraepithelial carcinoma including actinic keratoses and Bowen's disease; basal cell carcinoma; benign keratinocytic lesions including solar lentigo, seborrheic keratosis and lichen planus-like keratosis; dermatofibroma; melanoma; melanocytic nevus; and vascular lesions. The two main outcomes were the differences in the number of correct specific diagnoses per batch between all human readers and the top three algorithms, and between human experts and the top three algorithms.
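The per-batch outcome described in the Methods can be illustrated with a minimal sketch. It assumes uniform sampling without replacement for each 30-image batch and uses synthetic labels; the function names `draw_batch` and `correct_per_batch` are hypothetical, and nothing here reproduces the study's actual software or data.

```python
import random

# The seven predefined disease categories named in the Methods.
CATEGORIES = [
    "intraepithelial carcinoma",
    "basal cell carcinoma",
    "benign keratinocytic lesion",
    "dermatofibroma",
    "melanoma",
    "melanocytic nevus",
    "vascular lesion",
]

TEST_SET_SIZE = 1511  # images in the test set
BATCH_SIZE = 30       # images shown to a reader per batch


def draw_batch(rng, test_set_size=TEST_SET_SIZE, batch_size=BATCH_SIZE):
    """Pick image indices at random without replacement (assumed scheme)."""
    return rng.sample(range(test_set_size), batch_size)


def correct_per_batch(batch, ground_truth, predictions):
    """Count images in the batch whose predicted category matches the ground truth."""
    return sum(1 for i in batch if predictions[i] == ground_truth[i])


rng = random.Random(0)
# Synthetic ground truth for illustration only; not study data.
ground_truth = [rng.choice(CATEGORIES) for _ in range(TEST_SET_SIZE)]
batch = draw_batch(rng)
# A reader who always gives the ground-truth category scores 30 out of 30.
perfect_score = correct_per_batch(batch, ground_truth, ground_truth)
```

A score computed this way lies between 0 and 30, which is the scale on which the mean numbers of correct diagnoses are reported.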
Findings: Between Aug 4, 2018, and Sept 30, 2018, 511 human readers from 63 countries had at least one attempt in the reader study. 283 (55·4%) of 511 human readers were board-certified dermatologists, 118 (23·1%) were dermatology residents, and 83 (16·2%) were general practitioners. When comparing all human readers with all machine-learning algorithms, the algorithms achieved a mean of 2·01 (95% CI 1·97 to 2·04; p<0·0001) more correct diagnoses (17·91 [SD 3·42] vs 19·92 [4·27]). 27 human experts with more than 10 years of experience achieved a mean of 18·78 (SD 3·15) correct answers, compared with 25·43 (1·95) correct answers for the top three machine algorithms (mean difference 6·65, 95% CI 6·06-7·25; p<0·0001). The difference between human experts and the top three algorithms was significantly lower for images in the test set that were collected from sources not included in the training set (human underperformance of 11·4%, 95% CI 9·9-12·9 vs 3·6%, 0·8-6·3; p<0·0001).
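The reader shares and mean differences reported in the Findings follow directly from the stated counts and means. The short check below recomputes them; the variable names are illustrative, and only figures quoted in the abstract are used.

```python
# Reader counts quoted in the Findings.
readers_total = 511
groups = {
    "board-certified dermatologists": 283,
    "dermatology residents": 118,
    "general practitioners": 83,
}

# Percentage of all readers in each group, rounded to one decimal place.
shares = {name: round(100 * n / readers_total, 1) for name, n in groups.items()}

# Mean correct diagnoses per 30-image batch, as quoted in the Findings.
all_readers_mean = 17.91
all_algorithms_mean = 19.92
experts_mean = 18.78
top_three_mean = 25.43

# Differences in mean correct diagnoses (algorithms minus humans).
all_diff = round(all_algorithms_mean - all_readers_mean, 2)
expert_diff = round(top_three_mean - experts_mean, 2)

print(shares)
print(all_diff, expert_diff)
```

The recomputed values agree with the reported 55·4%, 23·1%, and 16·2% shares and with the mean differences of 2·01 and 6·65.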
Interpretation: State-of-the-art machine-learning classifiers outperformed human experts in the diagnosis of pigmented skin lesions and should have a more important role in clinical practice. However, a possible limitation of these algorithms is their decreased performance for out-of-distribution images, which should be addressed in future research.
Funding: None.
Copyright © 2019 Elsevier Ltd. All rights reserved.
Comment in
- Machine versus man in skin cancer diagnosis. Lancet Oncol. 2019 Jul;20(7):891-892. doi: 10.1016/S1470-2045(19)30391-2. Epub 2019 Jun 11. PMID: 31201138.