Key challenges for delivering clinical impact with artificial intelligence

Christopher J Kelly¹, Alan Karthikesalingam², Mustafa Suleyman³, Greg Corrado⁴, Dominic King²

Affiliations

¹ Google Health, London, UK. cjkelly@google.com.
² Google Health, London, UK.
³ DeepMind, London, UK.
⁴ Google Health, California, USA.

PMID: 31665002
PMCID: PMC6821018
DOI: 10.1186/s12916-019-1426-2

Christopher J Kelly et al. BMC Med. 2019 Oct 29;17(1):195.

Abstract

Background: Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications being demonstrated across various domains of medicine. However, there are currently limited examples of such techniques being successfully deployed into clinical practice. This article explores the main challenges and limitations of AI in healthcare, and considers the steps required to translate these potentially transformative technologies from research to clinical practice.

Main body: Key challenges for the translation of AI systems in healthcare include those intrinsic to the science of machine learning, logistical difficulties in implementation, and consideration of the barriers to adoption as well as of the necessary sociocultural or pathway changes. Robust peer-reviewed clinical evaluation as part of randomised controlled trials should be viewed as the gold standard for evidence generation, but conducting these in practice may not always be appropriate or feasible. Performance metrics should aim to capture real clinical applicability and be understandable to intended users. Regulation that balances the pace of innovation with the potential for harm, alongside thoughtful post-market surveillance, is required to ensure that patients are not exposed to dangerous interventions nor deprived of access to beneficial innovations. Mechanisms to enable direct comparisons of AI systems must be developed, including the use of independent, local and representative test sets. Developers of AI algorithms must be vigilant to potential dangers, including dataset shift, accidental fitting of confounders, unintended discriminatory bias, the challenges of generalisation to new populations, and the unintended negative consequences of new algorithms on health outcomes.

Conclusion: The safe and timely translation of AI research into clinically validated and appropriately regulated systems that can benefit everyone is challenging. Robust clinical evaluation, using metrics that are intuitive to clinicians and ideally go beyond measures of technical accuracy to include quality of care and patient outcomes, is essential. Further work is required (1) to identify themes of algorithmic bias and unfairness while developing mitigations to address these, (2) to reduce brittleness and improve generalisability, and (3) to develop methods for improved interpretability of machine learning predictions. If these goals can be achieved, the benefits for patients are likely to be transformational.

Keywords: Algorithms; Artificial intelligence; Evaluation; Machine learning; Regulation; Translation.

Conflict of interest statement

All authors are employed by Google LLC.
