Explainable AI improves task performance in human-AI collaboration
- PMID: 39730794
- PMCID: PMC11681242
- DOI: 10.1038/s41598-024-82501-9
Abstract
Artificial intelligence (AI) provides considerable opportunities to assist human work. However, one crucial challenge of human-AI collaboration is that many AI algorithms operate in a black-box manner in which the way the AI makes predictions remains opaque. This makes it difficult for humans to validate a prediction made by the AI against their own domain knowledge. For this reason, we hypothesize that augmenting humans with explainable AI improves task performance in human-AI collaboration. To test this hypothesis, we implement explainable AI in the form of visual heatmaps in inspection tasks conducted by domain experts. Visual heatmaps have the advantage that they are easy to understand and help localize relevant parts of an image. We then compare participants who were supported by either (a) black-box AI or (b) explainable AI, where the latter helps them follow AI predictions when the AI is accurate and overrule the AI when its predictions are wrong. We conducted two preregistered experiments with representative, real-world visual inspection tasks from manufacturing and medicine. The first experiment was conducted with factory workers from an electronics factory, who performed [Formula: see text] assessments of whether electronic products have defects. The second experiment was conducted with radiologists, who performed [Formula: see text] assessments of chest X-ray images to identify lung lesions. The results of our experiments with domain experts performing real-world tasks show that task performance improves when participants are supported by explainable AI with heatmaps instead of black-box AI. We find that explainable AI as a decision aid improved task performance by 7.7 percentage points (95% confidence interval [CI]: 3.3% to 12.0%, [Formula: see text]) in the manufacturing experiment and by 4.7 percentage points (95% CI: 1.1% to 8.3%, [Formula: see text]) in the medical experiment compared to black-box AI. These gains represent a significant improvement in task performance.
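The abstract does not specify how the visual heatmaps were generated. As a rough illustration only, the sketch below produces a saliency heatmap for an image classifier using a Grad-CAM-style approach with a torchvision ResNet-18; both the model and the Grad-CAM choice are assumptions for illustration, not the authors' method.

# Minimal Grad-CAM-style heatmap sketch (assumption: illustrative only,
# not the method used in the paper).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    # Cache the feature maps of the last convolutional block.
    activations["value"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    # Cache the gradients flowing back into those feature maps.
    gradients["value"] = grad_out[0].detach()

layer = model.layer4[-1]
layer.register_forward_hook(fwd_hook)
layer.register_full_backward_hook(bwd_hook)

def gradcam_heatmap(image: torch.Tensor) -> torch.Tensor:
    """image: (1, 3, H, W) normalized tensor; returns an (H, W) heatmap in [0, 1]."""
    logits = model(image)
    score = logits[0, logits[0].argmax()]   # score of the predicted class
    model.zero_grad()
    score.backward()
    acts = activations["value"]                       # (1, C, h, w)
    grads = gradients["value"]                        # (1, C, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)    # channel importance
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False).squeeze()
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

# Usage: heatmap = gradcam_heatmap(torch.randn(1, 3, 224, 224))
# The heatmap can then be overlaid on the input image to localize the
# regions that drove the prediction, analogous to the heatmaps described above.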
Keywords: Decision-making; Explainable AI; Human-centered AI; Human–AI collaboration; Task performance.
© 2024. The Author(s).
Conflict of interest statement
Declarations. Competing interests: The authors declare no competing interests.
Similar articles
- Effect of Uncertainty-Aware AI Models on Pharmacists' Reaction Time and Decision-Making in a Web-Based Mock Medication Verification Task: Randomized Controlled Trial. JMIR Med Inform. 2025 Apr 18;13:e64902. doi: 10.2196/64902. PMID: 40249341. Free PMC article. Clinical Trial.
- A Machine Learning Approach with Human-AI Collaboration for Automated Classification of Patient Safety Event Reports: Algorithm Development and Validation Study. JMIR Hum Factors. 2024 Jan 25;11:e53378. doi: 10.2196/53378. PMID: 38271086. Free PMC article.
- Explainability does not mitigate the negative impact of incorrect AI advice in a personnel selection task. Sci Rep. 2024 Apr 28;14(1):9736. doi: 10.1038/s41598-024-60220-5. PMID: 38679619. Free PMC article.
- Explainable AI for Bioinformatics: Methods, Tools and Applications. Brief Bioinform. 2023 Sep 20;24(5):bbad236. doi: 10.1093/bib/bbad236. PMID: 37478371. Review.
- Current status and future directions of explainable artificial intelligence in medical imaging. Eur J Radiol. 2025 Feb;183:111884. doi: 10.1016/j.ejrad.2024.111884. Epub 2024 Dec 6. PMID: 39667118. Review.