Auditing the inference processes of medical-image classifiers by leveraging generative AI and the expertise of physicians
- PMID: 38155295
- DOI: 10.1038/s41551-023-01160-9
Abstract
The inferences of most machine-learning models powering medical artificial intelligence are difficult to interpret. Here we report a general framework for model auditing that combines insights from medical experts with a highly expressive form of explainable artificial intelligence. Specifically, we leveraged the expertise of dermatologists for the clinical task of differentiating melanomas from melanoma 'lookalikes' on the basis of dermoscopic and clinical images of the skin, and the power of generative models to render 'counterfactual' images to understand the 'reasoning' processes of five medical-image classifiers. By altering image attributes to produce analogous images that elicit a different prediction by the classifiers, and by asking physicians to identify medically meaningful features in the images, the counterfactual images revealed that the classifiers rely both on features used by human dermatologists, such as lesional pigmentation patterns, and on undesirable features, such as background skin texture and colour balance. The framework can be applied to any specialized medical domain to make the powerful inference processes of machine-learning models medically understandable.
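The counterfactual idea in the abstract (alter image attributes until the classifier's prediction changes, then inspect what was altered) can be illustrated with a deliberately simplified sketch. This is not the authors' method: the paper renders counterfactual images with generative models, whereas the toy below perturbs a three-number attribute vector under a hand-written logistic classifier. The attribute names, weights, bias and step size are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical attribute names and classifier parameters (assumptions, not from the paper).
ATTRIBUTES = ["pigment_irregularity", "background_texture", "colour_balance"]
WEIGHTS = [2.0, 1.0, 0.5]
BIAS = -1.0


def predict(weights, bias, x):
    """Toy logistic classifier: probability of 'melanoma' for attribute vector x."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def counterfactual(weights, bias, x, step=0.05, max_steps=1000):
    """Nudge x along the logit gradient until the predicted label flips.

    Returns the altered attribute vector, or None if no flip occurs
    within max_steps.
    """
    start_label = predict(weights, bias, x) >= 0.5
    # For a logistic model, the gradient of the logit w.r.t. x is the weight vector.
    direction = -1.0 if start_label else 1.0
    norm = math.sqrt(sum(w * w for w in weights))
    cf = list(x)
    for _ in range(max_steps):
        if (predict(weights, bias, cf) >= 0.5) != start_label:
            return cf
        cf = [v + direction * step * w / norm for v, w in zip(cf, weights)]
    return None


# A made-up "image" described by three attribute values.
original = [0.9, 0.4, 0.2]
cf = counterfactual(WEIGHTS, BIAS, original)

# Per-attribute change needed to flip the prediction.
changes = {name: c - o for name, o, c in zip(ATTRIBUTES, original, cf)}
for name, delta in changes.items():
    print(f"{name}: {delta:+.3f}")
```

In this toy, the size of each entry in `changes` plays the role of the visual differences that physicians would inspect in a counterfactual image: a large shift in a lesion-related attribute suggests reliance on a medically meaningful feature, while a large shift in a background attribute would flag an undesirable one.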
© 2023. The Author(s), under exclusive licence to Springer Nature Limited.
Conflict of interest statement
Competing interests: R.D. reports fees from L’Oreal, Frazier Healthcare Partners, Pfizer, DWA and VisualDx for consulting; stock options from MDAcne and Revea for advisory board; and research funding from UCB. The other authors declare no competing interests.
Grants and funding
- R01 AG061132/AG/NIA NIH HHS/United States
- DBI-1759487/National Science Foundation (NSF)
- R35 GM 128638/U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- 5T32 AR007422-38/U.S. Department of Health & Human Services | National Institutes of Health (NIH)
