J Am Med Inform Assoc. 2024 Apr 3;31(4):1009-1024.
doi: 10.1093/jamia/ocae015.

Question answering systems for health professionals at the point of care-a systematic review

Gregory Kell et al. J Am Med Inform Assoc.

Abstract

Objectives: Question answering (QA) systems have the potential to improve the quality of clinical care by providing health professionals with the latest and most relevant evidence. However, QA systems have not been widely adopted. This systematic review aims to characterize current medical QA systems, assess their suitability for healthcare, and identify areas of improvement.

Materials and methods: We searched PubMed, IEEE Xplore, ACM Digital Library, ACL Anthology, and forward and backward citations on February 7, 2023. We included peer-reviewed journal and conference papers describing the design and evaluation of biomedical QA systems. Two reviewers screened titles, abstracts, and full-text articles. We conducted a narrative synthesis and risk of bias assessment for each study. We assessed the utility of biomedical QA systems.

Results: We included 79 studies and identified themes, including question realism, answer reliability, answer utility, clinical specialism, systems, usability, and evaluation methods. Clinicians' questions used to train and evaluate QA systems were restricted to certain sources, types and complexity levels. No system communicated confidence levels in the answers or sources. Many studies suffered from high risks of bias and applicability concerns. Only 8 studies completely satisfied any criterion for clinical utility, and only 7 reported user evaluations. Most systems were built with limited input from clinicians.

Discussion: While machine learning methods have led to increased accuracy, most studies imperfectly reflected real-world healthcare information needs. Key research priorities include developing more realistic healthcare QA datasets and considering the reliability of answer sources, rather than merely focusing on accuracy.

Keywords: artificial intelligence; clinical decision support; evidence-based medicine; natural language processing; question answering.

Conflict of interest statement

None declared.

Figures

Figure 1. PRISMA flow diagram.
Figure 2. Number of papers with each category of domain, method, question, answer source, and answer type. The distinction was made between a major category and all the others, as one main category tended to dominate several smaller others. Table 1 contains more detail on the specifics of each paper.
Figure 3. Number of papers achieving each risk of bias and applicability concern classification. Risk of bias refers to the risk of a divergence between the stated problem the paper tries to solve and the execution for reasons such as an unrealistic dataset or failing to split data for training and evaluation. Applicability refers to how applicable the system is to the review.
Figure 4. Number of papers achieving each satisfaction classification for each criterion.
Figure 5. Typical QA architecture as used by Alzubi et al.
Figure 6. Results of the BioASQ 5b and 6b challenges for factoid-type answers.
