Real-world validation of a multimodal LLM-powered pipeline for high-accuracy clinical trial patient matching
- PMID: 41275042
- DOI: 10.1038/s43856-025-01256-0
Abstract
Background: Patient recruitment in clinical trials is hindered by complex eligibility criteria and labor-intensive chart reviews. Prior research using text-only models has struggled to address this problem in a reliable and scalable way due to (1) limited reasoning capabilities, (2) information loss from converting visual records to text, and (3) lack of a generic EHR integration to extract patient data.
Methods: We introduce a broadly applicable, integration-free, LLM-powered pipeline that automates patient-trial matching using unprocessed documents extracted from EHRs. Our approach leverages (1) the new reasoning-LLM paradigm, enabling the assessment of even the most complex criteria, (2) the visual capabilities of the latest LLMs to interpret medical records without lossy image-to-text conversions, and (3) multimodal embeddings for efficient medical record search. The pipeline was validated on the n2c2 2018 cohort selection dataset (288 diabetic patients) and a real-world dataset composed of 485 patients from 30 different sites matched against 36 diverse trials.
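As an illustration of the embedding-based record search mentioned in the Methods, the following is a minimal sketch of ranking medical record pages against a single eligibility criterion by cosine similarity. It is not the authors' implementation: the function names (`cosine_similarity`, `top_k_pages`), the toy three-dimensional vectors, and the choice of cosine similarity as the ranking score are all assumptions made for illustration. In a real system the vectors would come from a multimodal embedding model applied to page images and criterion text.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_pages(
    criterion_embedding: Sequence[float],
    page_embeddings: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (page_index, score) pairs for the k pages most similar to the criterion."""
    scored = [
        (index, cosine_similarity(criterion_embedding, embedding))
        for index, embedding in enumerate(page_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data (hypothetical): three record pages and one criterion, already embedded.
pages = [
    [1.0, 0.0, 0.0],  # page 0
    [0.0, 1.0, 0.0],  # page 1
    [0.7, 0.7, 0.0],  # page 2
]
criterion = [1.0, 0.1, 0.0]

ranked = top_k_pages(criterion, pages, k=2)
ranked_indices = [index for index, _ in ranked]  # pages 0 and 2 rank highest here
```

Under this sketch, only the top-ranked pages would be passed to the reasoning model for a given criterion, which keeps the amount of record content per assessment small.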
Results: On the n2c2 dataset, our method achieves a new state-of-the-art criterion-level accuracy of 93%. In real-world trials, the pipeline reached an accuracy of 87%, limited by the difficulty of replicating human decision-making when medical records lack sufficient information. Nevertheless, users were able to review overall eligibility in under 9 minutes per patient on average, representing an 80% improvement over traditional manual chart reviews.
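To make the two reported quantities concrete, here is a small sketch of how a criterion-level accuracy can be computed and what the time figure implies. It rests on two assumptions that the abstract does not spell out: that criterion-level accuracy is the fraction of individual (patient, criterion) decisions that agree with the reference label, and that the "80% improvement" means an 80% reduction in review time. The function names and toy labels are hypothetical and are not data from the study.

```python
from typing import Sequence


def criterion_level_accuracy(
    predicted: Sequence[bool], reference: Sequence[bool]
) -> float:
    """Fraction of individual criterion decisions that match the reference labels."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one decision is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


def implied_baseline_minutes(assisted_minutes: float, reduction: float) -> float:
    """Baseline review time implied by an assisted time and a fractional time reduction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return assisted_minutes / (1.0 - reduction)


# Toy labels (hypothetical): four criterion decisions, three of which agree.
toy_predicted = [True, False, True, True]
toy_reference = [True, False, False, True]
toy_accuracy = criterion_level_accuracy(toy_predicted, toy_reference)

# If 9 minutes reflects an 80% reduction, the manual baseline is roughly 45 minutes.
baseline = implied_baseline_minutes(9.0, 0.80)
```

Because the abstract reports "under 9 minutes", the figure of roughly 45 minutes is an upper bound on the implied manual review time under this reading, not a value stated by the authors.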
Conclusions: This pipeline demonstrates robust performance in clinical trial patient matching without requiring custom integration with site systems or trial-specific tailoring, thereby enabling scalable deployment across sites seeking to leverage AI for patient matching.
Plain language summary
Recruiting patients for clinical trials is time-consuming and resource-intensive because eligibility rules are complex and medical records are lengthy. We built an artificial intelligence (AI) system that helps match patients to trials by reading both text and images from medical records, including scans, tables, and handwriting. This digital tool finds the most relevant pages, checks each rule step by step, and clearly flags when medical information is missing. We evaluated it on a widely used public dataset and in real clinics across many sites and trials. The system produced reliable, high-quality eligibility assessments, and coordinators were able to review each patient in under nine minutes on average, much faster than manual chart review. Because it works without custom connections to hospital software, it can be deployed broadly to reduce delays and help more patients access studies and new treatments.
© 2025. The Author(s).
Conflict of interest statement
Competing interests: The authors are employees or administrators of Inato, a company that develops products evaluated in this study.