JMIR Dermatol. 2022 Aug 19;5(3):e39143.
doi: 10.2196/39143.

Improving Skin Color Diversity in Cancer Detection: Deep Learning Approach

Eman Rezk et al. JMIR Dermatol.

Abstract

Background: The lack of dark skin images of pathologic skin lesions in dermatology resources hinders the accurate diagnosis of skin lesions in people of color. Artificial intelligence applications have further disadvantaged people of color because those applications are mainly trained with light skin color images.

Objective: The aim of this study is to develop a deep learning approach that generates realistic images of darker skin colors to improve dermatology data diversity for various malignant and benign lesions.

Methods: We collected clinical skin images for common malignant and benign skin conditions from DermNet NZ, the International Skin Imaging Collaboration, and Dermatology Atlas. Two deep learning methods, style transfer (ST) and deep blending (DB), were used to generate images with darker skin colors from the lighter skin images. The generated images were evaluated quantitatively and qualitatively. Furthermore, a convolutional neural network (CNN) was trained using the generated images to assess their effect on skin lesion classification accuracy.

Results: Image quality assessment showed that the ST method outperformed DB: ST achieved a lower loss-of-realism score of 0.23 (95% CI 0.19-0.27) compared to 0.63 (95% CI 0.59-0.67) for the DB method. In addition, ST preserved disease presentation better, with a similarity score of 0.44 (95% CI 0.40-0.49) compared to 0.17 (95% CI 0.14-0.21) for the DB method. The qualitative assessment completed by masked participants indicated that ST-generated images exhibited high realism: 62.2% (1511/2430) of the votes for the generated images classified them as real. Eight dermatologists correctly diagnosed the lesions in the generated images at an average rate of 0.75 (360 correct diagnoses out of 480) for several malignant and benign lesions. Finally, the classification accuracy and the area under the curve (AUC) of the model when the generated images were included were 0.76 (95% CI 0.72-0.79) and 0.72 (95% CI 0.67-0.77), respectively, compared to an accuracy of 0.56 (95% CI 0.52-0.60) and an AUC of 0.63 (95% CI 0.58-0.68) for the model without the generated images.
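
The two proportions reported from raw counts above can be reproduced with a few lines of Python. This is only an illustrative sketch: the abstract does not say how its confidence intervals were computed, so the Wilson score interval used here, and the helper name `wilson_interval`, are assumptions rather than the authors' method.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Return (proportion, lower, upper) for a Wilson score interval.

    Assumption: z = 1.96 approximates a 95% interval; the source does not
    state which interval method the authors used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half

# Counts taken from the Results paragraph.
realism = wilson_interval(1511, 2430)    # votes classifying generated images as real
diagnosis = wilson_interval(360, 480)    # correct diagnoses by the dermatologists

for name, (p, low, high) in [("classified as real", realism),
                             ("correct diagnoses", diagnosis)]:
    print(f"{name}: {p:.3f} (95% CI {low:.3f}-{high:.3f})")
```

The point estimates match the reported 62.2% and 0.75; the interval bounds are shown only to illustrate the calculation and should not be read as values from the paper.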

Conclusions: Deep learning approaches can generate realistic skin lesion images that improve the skin color diversity of dermatology atlases. The diversified image bank, utilized herein to train a CNN, demonstrates the potential of developing generalizable artificial intelligence skin cancer diagnosis applications.

International Registered Report Identifier (IRRID): RR2-10.2196/34896.

Keywords: algorithm; artificial intelligence; cancer; computer-generated; data augmentation; deep learning; dermatology; diagnosis; diagnostic; digital health; generalizability; generated image; image generation; imaging; lesion; machine learning; neural network; skin; skin cancer diagnosis; skin tone diversity.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Style transfer (ST) in skin images. (A) VGG architecture. (B) Process of ST.
Figure 2. Skin tone classification. ITA: individual typology angle.
Figure 3. Image classification process. CNN: convolutional neural network; Tr: training set; Ts: test set; Vl: validation set.
Figure 4. Classification network. (A) ResNet-50 architecture and (B) the customized ResNet-50.
Figure 5. Generated images using style transfer (ST) and deep blending (DB) compared to the real images.
Figure 6. Generated score versus the real score. Line represents the linear regression model with the standard error shaded.
Figure 7. Evaluation of the human Visual Turing test results, with error bars representing 95% CI. FPR: false positive rate; TPR: true positive rate.
Figure 8. Recall of the utilized diseases, with error bars representing 95% CI. AK: actinic keratosis; AN: atypical nevi; BCC: basal cell carcinoma; IEC: intraepidermal carcinoma; HN: halo nevus; Hem: hemangioma; Mel: melanoma; SCC: squamous cell carcinoma; SK: seborrheic keratosis; VM: vascular malformation.
Figure 9. Confusion matrix of the real and generated images. (A) Real images, (B) tan-generated images, (C) brown-generated images, and (D) dark-generated images.
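
Figure 2 refers to the individual typology angle (ITA), which is conventionally computed from the CIELAB lightness L* and yellow-blue component b* as ITA = arctan((L* - 50) / b*) × 180/π. The sketch below shows that calculation and a mapping to skin tone categories. The cut-off values are the commonly cited ones and are an assumption here, as are the function names; the source does not list the thresholds the authors used, only that tan, brown, and dark images were generated.

```python
import math

def individual_typology_angle(l_star, b_star):
    """Return the ITA in degrees from CIELAB L* and b* values.

    atan2 is used so that b* == 0 does not divide by zero; for b* > 0
    (typical of skin) it equals arctan((L* - 50) / b*).
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))

def skin_tone_category(ita):
    """Map an ITA value to a category.

    Assumption: thresholds follow the commonly cited ITA scale, not
    values stated in the source.
    """
    if ita > 55:
        return "very light"
    if ita > 41:
        return "light"
    if ita > 28:
        return "intermediate"
    if ita > 10:
        return "tan"
    if ita > -30:
        return "brown"
    return "dark"

# Hypothetical L*, b* pairs chosen only to exercise each darker category.
for l_star, b_star in [(55.0, 15.0), (50.0, 15.0), (30.0, 15.0)]:
    ita = individual_typology_angle(l_star, b_star)
    print(f"L*={l_star}, b*={b_star}: ITA={ita:.1f} -> {skin_tone_category(ita)}")
```

In practice the L* and b* values would be averaged over non-lesion skin pixels after converting the image to CIELAB; that step is omitted here.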
