Retrieval-Based Diagnostic Decision Support: Mixed Methods Study
- PMID: 38896468
- PMCID: PMC11222760
- DOI: 10.2196/50209
Abstract
Background: Diagnostic errors pose significant health risks and contribute to patient mortality. With the growing accessibility of electronic health records, machine learning models offer a promising avenue for enhancing diagnosis quality. Current research has primarily focused on a limited set of diseases with ample training data, neglecting diagnostic scenarios with limited data availability.
Objective: This study aims to develop an information retrieval (IR)-based framework that accommodates data sparsity to facilitate broader diagnostic decision support.
Methods: We introduced an IR-based diagnostic decision support framework called CliniqIR. It uses clinical text records, the Unified Medical Language System Metathesaurus, and 33 million PubMed abstracts to classify a broad spectrum of diagnoses independent of training data availability. CliniqIR is designed to be compatible with any IR framework; we therefore implemented it using both dense and sparse retrieval approaches. We compared CliniqIR's performance to that of pretrained clinical transformer models such as Clinical Bidirectional Encoder Representations from Transformers (ClinicalBERT) in supervised and zero-shot settings. Subsequently, we combined the strengths of supervised fine-tuned ClinicalBERT and CliniqIR to build an ensemble framework that delivers state-of-the-art diagnostic predictions.
Results: On a complex diagnosis data set (DC3) without any training data, CliniqIR models returned the correct diagnosis within their top 3 predictions. On the Medical Information Mart for Intensive Care III data set, CliniqIR models surpassed ClinicalBERT in predicting diagnoses with <5 training samples by an average difference in mean reciprocal rank of 0.10. In a zero-shot setting where models received no disease-specific training, CliniqIR still outperformed the pretrained transformer models with a greater mean reciprocal rank of at least 0.10. Furthermore, in most conditions, our ensemble framework surpassed the performance of its individual components, demonstrating its enhanced ability to make precise diagnostic predictions.
Conclusions: Our experiments highlight the importance of IR in leveraging unstructured knowledge resources to identify infrequently encountered diagnoses. In addition, our ensemble framework benefits from combining the complementary strengths of the supervised and retrieval-based models to diagnose a broad spectrum of diseases.
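To make the retrieval-and-ranking idea and the mean reciprocal rank (MRR) metric from the abstract concrete, the following is a minimal illustrative sketch, not the authors' CliniqIR implementation. It assumes a tiny hypothetical corpus of literature snippets, each tagged with a diagnosis label, ranks diagnoses for a clinical note by TF-IDF cosine similarity (one possible sparse retrieval approach), and scores the rankings with MRR. The corpus, the function names, and the choice to score each diagnosis by its best-matching snippet are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy corpus: (diagnosis label, abstract-like snippet).
CORPUS = [
    ("pneumonia", "fever productive cough and lobar consolidation on chest radiograph"),
    ("pulmonary embolism", "sudden dyspnea pleuritic chest pain and tachycardia after immobilization"),
    ("myocardial infarction", "crushing chest pain radiating to the left arm with troponin elevation"),
    ("pneumonia", "cough fever and crackles treated with antibiotics"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(corpus):
    """Return per-document term counts and smoothed IDF weights."""
    docs = [Counter(tokenize(text)) for _, text in corpus]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return docs, idf


def tfidf(counts, idf):
    """Weight raw term counts by IDF; terms unseen in the corpus are dropped."""
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_diagnoses(note, corpus, docs, idf):
    """Rank diagnosis labels by their best-matching snippet (an assumed aggregation)."""
    query = tfidf(Counter(tokenize(note)), idf)
    scores = defaultdict(float)
    for (label, _), doc in zip(corpus, docs):
        scores[label] = max(scores[label], cosine(query, tfidf(doc, idf)))
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda label: (-scores[label], label))


def mean_reciprocal_rank(rankings, gold):
    """Average of 1/rank of the correct label; a missing label contributes 0."""
    total = 0.0
    for ranked, truth in zip(rankings, gold):
        if truth in ranked:
            total += 1.0 / (ranked.index(truth) + 1)
    return total / len(gold)


docs, idf = build_index(CORPUS)
notes = ["fever cough consolidation", "chest pain radiating to left arm"]
gold = ["pneumonia", "myocardial infarction"]
rankings = [rank_diagnoses(note, CORPUS, docs, idf) for note in notes]
print(rankings)
print(mean_reciprocal_rank(rankings, gold))
```

Because the ranker only needs text that mentions a diagnosis, not labeled patient records, a sketch like this can score diagnoses that have few or no training samples, which is the data-sparsity setting the abstract targets. An MRR difference of 0.10, as reported in the Results, corresponds to a shift in the average reciprocal position of the correct diagnosis across test cases.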
Keywords: EHR; RAG; clinical decision support; data sparsity; electronic health record; electronic health records; ensemble learning; information retrieval; machine learning; natural language processing; rare diseases; retrieval augmented generation; retrieval-augmented learning.
©Tassallah Abdullahi, Laura Mercurio, Ritambhara Singh, Carsten Eickhoff. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 19.06.2024.
Conflict of interest statement
Conflicts of Interest: None declared.