JMIR Med Inform. 2024 Jun 19:12:e50209.

doi: 10.2196/50209.

Retrieval-Based Diagnostic Decision Support: Mixed Methods Study

Tassallah Abdullahi¹, Laura Mercurio², Ritambhara Singh¹,³, Carsten Eickhoff⁴

Affiliations

¹ Department of Computer Science, Brown University, Providence, RI, United States.
² Departments of Pediatrics & Emergency Medicine, Alpert Medical School, Brown University, Providence, RI, United States.
³ Center for Computational Molecular Biology, Brown University, Providence, RI, United States.
⁴ School of Medicine, University of Tübingen, Tübingen, Germany.

PMID: 38896468
PMCID: PMC11222760
DOI: 10.2196/50209

Tassallah Abdullahi et al. JMIR Med Inform. 2024.

Abstract

Background: Diagnostic errors pose significant health risks and contribute to patient mortality. With the growing accessibility of electronic health records, machine learning models offer a promising avenue for enhancing diagnosis quality. Current research has primarily focused on a limited set of diseases with ample training data, neglecting diagnostic scenarios with limited data availability.

Objective: This study aims to develop an information retrieval (IR)-based framework that accommodates data sparsity to facilitate broader diagnostic decision support.

Methods: We introduced an IR-based diagnostic decision support framework called CliniqIR. It uses clinical text records, the Unified Medical Language System Metathesaurus, and 33 million PubMed abstracts to classify a broad spectrum of diagnoses independent of training data availability. CliniqIR is designed to be compatible with any IR framework. Therefore, we implemented it using both dense and sparse retrieval approaches. We compared CliniqIR's performance to that of pretrained clinical transformer models such as Clinical Bidirectional Encoder Representations from Transformers (ClinicalBERT) in supervised and zero-shot settings. Subsequently, we combined the strength of supervised fine-tuned ClinicalBERT and CliniqIR to build an ensemble framework that delivers state-of-the-art diagnostic predictions.
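As a rough illustration of the sparse retrieval idea mentioned above, the following sketch scores a handful of abstracts against a clinical note with BM25 and ranks diagnoses by their best-matching abstract. It is not CliniqIR's implementation: the toy corpus, the abstract-to-diagnosis labels, the whitespace tokenizer, the `k1` and `b` values, and the max-score aggregation are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a standard BM25 formula."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


def rank_diagnoses(note, corpus):
    """Rank diagnoses by the best BM25 score among their labeled abstracts."""
    texts = [text for text, _ in corpus]
    best = {}
    for (_, diagnosis), score in zip(corpus, bm25_scores(note, texts)):
        best[diagnosis] = max(score, best.get(diagnosis, 0.0))
    return sorted(best, key=lambda d: (-best[d], d))


# Hypothetical abstracts, each labeled with a hypothetical diagnosis concept.
corpus = [
    ("prolonged fever rash conjunctivitis and coronary aneurysm in children", "kawasaki disease"),
    ("wheezing cough and shortness of breath triggered by allergens", "asthma"),
    ("joint pain swelling and elevated uric acid", "gout"),
]
note = "child with prolonged fever rash and conjunctivitis"
ranked = rank_diagnoses(note, corpus)
print(ranked[0])
```

Because the ranking depends only on text overlap with the knowledge source, a diagnosis can be returned even when no labeled patient notes exist for it, which is the property the framework relies on.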

Results: On a complex diagnosis data set (DC3) without any training data, CliniqIR models returned the correct diagnosis within their top 3 predictions. On the Medical Information Mart for Intensive Care III data set, CliniqIR models surpassed ClinicalBERT in predicting diagnoses with <5 training samples by an average difference in mean reciprocal rank of 0.10. In a zero-shot setting where models received no disease-specific training, CliniqIR still outperformed the pretrained transformer models, with a mean reciprocal rank at least 0.10 higher. Furthermore, in most conditions, our ensemble framework surpassed the performance of its individual components, demonstrating its enhanced ability to make precise diagnostic predictions.
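For readers unfamiliar with the metric, mean reciprocal rank averages 1/rank of the correct diagnosis over all test cases, so a difference of 0.10 is an absolute change on a 0 to 1 scale. A minimal sketch follows; the diagnosis labels and rankings are made up for illustration, and scoring a missing diagnosis as 0 is a common convention assumed here rather than a detail taken from the study.

```python
from typing import Sequence


def reciprocal_rank(ranked: Sequence[str], truth: str) -> float:
    """Return 1/rank of the true label (1-indexed), or 0.0 if it is absent."""
    for position, label in enumerate(ranked, start=1):
        if label == truth:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]], truths: Sequence[str]) -> float:
    """Average the reciprocal rank over all cases."""
    if len(rankings) != len(truths):
        raise ValueError("rankings and truths must have the same length")
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, t) for r, t in zip(rankings, truths))
    return total / len(rankings)


# Hypothetical predictions for three patient notes.
rankings = [
    ["sepsis", "pneumonia", "asthma"],  # truth at rank 2 -> 1/2
    ["asthma", "sepsis", "pneumonia"],  # truth at rank 1 -> 1
    ["pneumonia", "sepsis", "gout"],    # truth not returned -> 0
]
truths = ["pneumonia", "asthma", "asthma"]

mrr = mean_reciprocal_rank(rankings, truths)
print(mrr)  # (0.5 + 1.0 + 0.0) / 3 = 0.5
```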

Conclusions: Our experiments highlight the importance of IR in leveraging unstructured knowledge resources to identify infrequently encountered diagnoses. In addition, our ensemble framework benefits from combining the complementary strengths of the supervised and retrieval-based models to diagnose a broad spectrum of diseases.

Keywords: EHR; RAG; clinical decision support; data sparsity; electronic health record; electronic health records; ensemble learning; information retrieval; machine learning; natural language processing; rare diseases; retrieval augmented generation; retrieval-augmented learning.

©Tassallah Abdullahi, Laura Mercurio, Ritambhara Singh, Carsten Eickhoff. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 19.06.2024.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

**Figure 1**
CliniqIR and Clinical Bidirectional Encoder Representations from Transformers (ClinicalBERT) classify patient notes and generate ranked lists of potential diagnoses. The reciprocal rank fusion (RRF) ensemble reranks the lists from both models to provide clinicians with a more accurate final ranking of differential diagnoses to aid the diagnostic process. MIMIC-III: Medical Information Mart for Intensive Care III; PMID: PubMed ID.
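The reranking step in this caption can be sketched with the standard reciprocal rank fusion formula, in which each diagnosis receives the sum of 1/(k + rank) over the input lists. The constant k = 60 is the value commonly used in the RRF literature and is an assumption here, as are the example rankings, the function name, and the alphabetical tie-break; none of them is taken from the study.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one list ordered by summed 1/(k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, label in enumerate(ranking, start=1):
            scores[label] += 1.0 / (k + rank)
    # Higher fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda label: (-scores[label], label))


# Hypothetical ranked diagnoses from a retrieval model and a supervised model.
retrieval_ranking = ["kawasaki disease", "scarlet fever", "measles"]
supervised_ranking = ["scarlet fever", "measles", "kawasaki disease"]

fused = reciprocal_rank_fusion([retrieval_ranking, supervised_ranking])
print(fused)  # ['scarlet fever', 'kawasaki disease', 'measles']
```

RRF uses only rank positions, so the two models' scores do not need to be on a comparable scale, and a diagnosis missing from one list simply receives no contribution from it.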

**Figure 2**
Overview of CliniqIR, the retrieval-based clinical decision support system. PMID: PubMed ID.

**Figure 3**
Outputs from the QuickUMLS tool developed by Soldani and Goharian [33] showing (A) a graph of extracted concepts and their concept unique identifiers (CUIs) for a specific input text, in which the underlined texts are considered important words and their corresponding Unified Medical Language System terms and CUIs are returned; and (B) a query processing pipeline, in which each text marked with a strike-through is filtered out to obtain a query.

**Figure 4**
Mean reciprocal rank (MRR) results for CliniqIR-based models and Clinical Bidirectional Encoder Representations from Transformers (ClinicalBERT) when predicting diagnoses with training sample sizes of 0, 1, 2, 3, 5, 6, and 7. Results indicate that the CliniqIR-based models perform best when the training sample size is between 0 and 5. However, ClinicalBERT performs best as training data size increases. “S” denotes that the ClinicalBERT model was used in a supervised setting.

**Figure 5**
Performance evaluation of CliniqIR models and each pretrained zero-shot baseline on the Medical Information Mart for Intensive Care III data set. We categorized the results by the number of notes representing each diagnosis. “Z” represents models used in a zero-shot setting. The CliniqIR models performed best across data set categories in the low-resource regime. ClinicalBERT: Clinical Bidirectional Encoder Representations from Transformers; CODER: cross-lingual knowledge-infused medical term embedding; MedCPT: Medical Contrastive Pre-trained Transformers; MRR: mean reciprocal rank; PubMedBERT: PubMed Bidirectional Encoder Representations from Transformers; SapBERT: Self-alignment Pretrained Bidirectional Encoder Representations from Transformers; SciBERT: Scientific Bidirectional Encoder Representations from Transformers.

**Figure 6**
Performance evaluation of the models on the Medical Information Mart for Intensive Care III data set before and after the ensemble. Adopting the reciprocal rank fusion (RRF) algorithm as an ensemble strategy boosted predictive performance across the data set. The Clinical Bidirectional Encoder Representations from Transformers (ClinicalBERT) model cannot directly make predictions for diagnoses with no training samples. Hence, we used “*” to mark such data set categories. The letter “S” denotes that ClinicalBERT was used as a supervised model. MedCPT: Medical Contrastive Pre-trained Transformers; MRR: mean reciprocal rank.

Grants and funding

T32 DA013911/DA/NIDA NIH HHS/United States
