JAMA Dermatol. 2024 Mar 1;160(3):303-311.
doi: 10.1001/jamadermatol.2023.5550.

Federated Learning for Decentralized Artificial Intelligence in Melanoma Diagnostics

Sarah Haggenmüller et al. JAMA Dermatol. 2024.

Abstract

Importance: The development of artificial intelligence (AI)-based melanoma classifiers typically calls for large, centralized datasets, requiring hospitals to give away their patient data, which raises serious privacy concerns. To address this concern, decentralized federated learning has been proposed, where classifier development is distributed across hospitals.
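As a rough illustration of how classifier development can be distributed across hospitals, the sketch below has each site train on its own data and share only model weights, which a coordinator then averages. The abstract does not state which aggregation rule or model the study used; the sample-size-weighted averaging (federated averaging), the toy logistic-regression model, and all names (`local_update`, `federated_average`, `train_federated`) are assumptions made for illustration only.

```python
import math
from typing import List, Sequence, Tuple

Weights = List[float]
Example = Tuple[Sequence[float], int]  # (feature vector, label in {0, 1})


def local_update(weights: Weights, data: Sequence[Example],
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Train a copy of the shared weights on one site's private data (SGD)."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            for j, xj in enumerate(x):
                w[j] -= lr * (p - y) * xj   # logistic-loss gradient step
    return w


def federated_average(site_weights: Sequence[Weights],
                      site_sizes: Sequence[int]) -> Weights:
    """Average site models, weighting each by its number of samples."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]


def train_federated(sites: Sequence[Sequence[Example]], dim: int,
                    rounds: int = 20) -> Weights:
    """Run several rounds; only weights, never raw examples, leave a site."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in sites]
        global_w = federated_average(updates, [len(d) for d in sites])
    return global_w


# Hypothetical toy data: two "hospitals", features are [bias, measurement].
site_a = [([1.0, 2.0], 1), ([1.0, -2.0], 0)]
site_b = [([1.0, 1.5], 1), ([1.0, -1.0], 0), ([1.0, -1.5], 0)]
model = train_federated([site_a, site_b], dim=2)
```

In this sketch the coordinator sees only the per-site weight vectors and sample counts, which is the property that motivates the approach; real deployments would add secure transport and typically a far larger model.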

Objective: To investigate whether a more privacy-preserving federated learning approach can achieve comparable diagnostic performance to a classical centralized (ie, single-model) and ensemble learning approach for AI-based melanoma diagnostics.

Design, setting, and participants: This multicentric, single-arm diagnostic study developed a federated model for melanoma-nevus classification using histopathological whole-slide images prospectively acquired at 6 German university hospitals between April 2021 and February 2023 and benchmarked it using both a holdout and an external test dataset. Data analysis was performed from February to April 2023.

Exposures: All whole-slide images were retrospectively analyzed by an AI-based classifier without influencing routine clinical care.

Main outcomes and measures: The area under the receiver operating characteristic curve (AUROC) served as the primary end point for evaluating the diagnostic performance. Secondary end points included balanced accuracy, sensitivity, and specificity.
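For readers unfamiliar with these end points, the following minimal sketch shows one common way to compute them for a binary melanoma-nevus classifier. It is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration. AUROC is computed here as the fraction of (melanoma, nevus) pairs in which the melanoma receives the higher score, with ties counted as one half.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, and balanced accuracy at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical example: 1 = melanoma, 0 = nevus.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auroc(labels, scores))              # 8 of 9 pairs ranked correctly
print(threshold_metrics(labels, scores))
```

Balanced accuracy averages sensitivity and specificity, which makes it less sensitive than plain accuracy to the imbalance between melanomas and nevi in a dataset.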

Results: The study included 1025 whole-slide images of clinically melanoma-suspicious skin lesions from 923 patients, consisting of 388 histopathologically confirmed invasive melanomas and 637 nevi. The median (range) age at diagnosis was 58 (18-95) years for the training set, 57 (18-93) years for the holdout test dataset, and 61 (18-95) years for the external test dataset; the median (range) Breslow thickness was 0.70 (0.10-34.00) mm, 0.70 (0.20-14.40) mm, and 0.80 (0.30-20.00) mm, respectively. The federated approach (0.8579; 95% CI, 0.7693-0.9299) performed significantly worse than the classical centralized approach (0.9024; 95% CI, 0.8379-0.9565) in terms of AUROC on a holdout test dataset (pairwise Wilcoxon signed-rank, P < .001) but performed significantly better (0.9126; 95% CI, 0.8810-0.9412) than the classical centralized approach (0.9045; 95% CI, 0.8701-0.9331) on an external test dataset (pairwise Wilcoxon signed-rank, P < .001). Notably, the federated approach performed significantly worse than the ensemble approach on both the holdout (0.8867; 95% CI, 0.8103-0.9481) and external test dataset (0.9227; 95% CI, 0.8941-0.9479).

Conclusions and relevance: The findings of this diagnostic study suggest that federated learning is a viable approach for the binary classification of invasive melanomas and nevi on a clinically representative distributed dataset. Federated learning can improve privacy protection in AI-based melanoma diagnostics while simultaneously promoting collaboration across institutions and countries. Moreover, it may have the potential to be extended to other image classification tasks in digital cancer histopathology and beyond.

Conflict of interest statement

Conflict of Interest Disclosures: Ms Haggenmüller reported grants from Federal Ministry of Health, Berlin, Germany (grants: Skin Classification Project 2 [SCP2] and Tumor Behavior Prediction Initiative [TPI]; grant holder in both cases: Titus J. Brinker, German Cancer Research Center, Heidelberg, Germany) during the conduct of the study. Dr Krieghoff-Henning reported grants from German Federal Ministry of Health during the conduct of the study. Mr Hekler reported grants from German Federal Ministry of Health during the conduct of the study. Mr Maron reported grants from German Federal Ministry of Health during the conduct of the study. Prof Utikal reported personal fees from Amgen, Bristol Myers Squibb, GSK, Immunocore, LEO Pharma, Merck Sharp & Dohme, Novartis, Pierre Fabre, Roche, and Sanofi outside the submitted work. Prof Meier reported grants from Novartis and Roche; other (travel support, speaker’s fees, and/or advisor’s honoraria) from BMS, MSD, and Pierre Fabre outside the submitted work. Dr Hobelsberger reported clinical trial support from Almirall; speaker’s honoraria from Almirall, UCB, and AbbVie; and travel support from UCB, Janssen Cilag, Almirall, Novartis, Lilly, LEO Pharma, and AbbVie outside the submitted work. Prof Heinzerling reported other (clinical studies) from BMS, MSD, Pierre Fabre, Replimune, and Sanofi; personal fees from Biomedx, BMS, MSD, Sun, Pierre Fabre, Novartis, and Sanofi; and grants from Therakos outside the submitted work. Dr Schlaak reported personal fees from BMS, Novartis, Immunocore, Sun Pharma, MSD, Recordati, and Sanofi Aventis outside the submitted work. Prof Berking reported grants from BMG Bundesministerium für Gesundheit to institute during the conduct of the study; personal fees from BMS, MSD, InflaRx, Novartis, Sanofi, LEO Pharma, Almirall Hermal, Pierre Fabre, Immunocore, and Delcath outside the submitted work. Dr Sondermann reported grants from Almirall and Medi GmbH; and personal fees from AbbVie, BMS, Boehringer Ingelheim, Celgene, Janssen, LEO Pharma, Lilly, Novartis, Pfizer, Sanofi Genzyme, and UCB outside the submitted work. Prof Goebeler reported grants from DKFZ Heidelberg during the conduct of the study; grants (clinical study) from Argenx, Novartis, Janssen, and Galderma; personal fees from Almirall (consulting), Janssen (advisory board, speaker), GSK (advisory board, speaker), and Lilly (speaker) outside the submitted work. Prof Kather reported personal fees from Owkin, Panakeia, DoMore Diagnostics, Histofy, Roche, MSD, BMS, Eisai, Bayer, Fresenius, and Pfizer outside the submitted work. Dr Brinker reported being owner of Smart Health Heidelberg GmbH outside the submitted work. No other disclosures were reported.

Figures

Figure 1. Flowchart of the Slide Inclusion Process
Slides were excluded from the analysis if there was no histopathologically confirmed label available or if the lesion proved to be neither invasive melanoma (IM) nor nevus (in situ tumors or other diagnoses, eg, basal cell carcinoma, squamous cell carcinoma). In addition, slides that exhibited fewer than 50 epidermal patches or other technical issues were removed.
Figure 2. Mean Area Under the Receiver Operating Characteristic Curve (AUROC) of the 3 Investigated Approaches
Mean AUROCs on the holdout and external test dataset after 1000 iterations of bootstrapping, including the corresponding 95% CIs (shaded areas), are illustrated for the federated learning (FL) and the centralized approach (model Hfull) (A and B) and for the FL and the ensemble approach (C and D). AUC indicates area under the curve.
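The bootstrapping described in this caption can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: it assumes resampling of individual slides with replacement, a percentile-based 95% CI, and a fixed random seed, none of which the caption specifies. The function names and toy data are hypothetical.

```python
import random
from statistics import mean
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auroc(labels: Sequence[int], scores: Sequence[float],
                    n_iter: int = 1000, alpha: float = 0.05,
                    seed: int = 0) -> Tuple[float, float, float]:
    """Return (mean AUROC, lower bound, upper bound) over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUROC is undefined without both classes; redraw
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[int((1 - alpha / 2) * n_iter) - 1]
    return mean(stats), lower, upper


# Hypothetical example: 1 = melanoma, 0 = nevus.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
avg, lower, upper = bootstrap_auroc(labels, scores)
print(f"mean AUROC {avg:.4f} (95% CI, {lower:.4f}-{upper:.4f})")
```

Resamples that happen to contain only one class are redrawn here; other implementations stratify the resampling by class instead, which is an equally defensible choice.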
