Performance of large language models in interventional cardiology: the ILLUMINATE blinded model-comparison study

Affiliations

¹ Division of Cardiology, Santa Maria Goretti Hospital, Latina, Italy; Cardiology Unit, Department of Emergency and Admission, San Paolo Hospital, Civitavecchia, Italy; Department of Cardiovascular Sciences, Fondazione Policlinico Agostino Gemelli IRCCS, Rome, Italy; Division of Cardiology, Cardiovascular and Thoracic Department, Città della Salute e della Scienza, Turin, Italy; Division of Cardiology, Department of Medical Sciences, University of Turin, Italy; Department of Medical-Surgical Sciences and Biotechnologies, Sapienza University of Rome, Latina, Italy; Maria Cecilia Hospital, GVM Care and Research, Cotignola, Italy; ICOT Marco Pasquali Institute, Cardiovascular Department Latina; Department of Clinical and Molecular Medicine, Sapienza University of Rome, Rome, Italy. Email: attilio.lauretti@uniroma1.it.
² Division of Cardiology, Santa Maria Goretti Hospital, Latina, Italy.
³ Cardiology Unit, Department of Emergency and Admission, San Paolo Hospital, Civitavecchia, Italy.
⁴ Department of Cardiovascular Sciences, Fondazione Policlinico Agostino Gemelli IRCCS, Rome, Italy.
⁵ Division of Cardiology, Cardiovascular and Thoracic Department, Città della Salute e della Scienza, Turin, Italy; Division of Cardiology, Department of Medical Sciences, University of Turin, Italy.
⁶ Division of Cardiology, Santa Maria Goretti Hospital, Latina, Italy; Cardiology Unit, Department of Emergency and Admission, San Paolo Hospital, Civitavecchia, Italy; Department of Cardiovascular Sciences, Fondazione Policlinico Agostino Gemelli IRCCS, Rome, Italy; Division of Cardiology, Cardiovascular and Thoracic Department, Città della Salute e della Scienza, Turin, Italy; Division of Cardiology, Department of Medical Sciences, University of Turin, Italy; Department of Medical-Surgical Sciences and Biotechnologies, Sapienza University of Rome, Latina, Italy.
⁷ Department of Clinical and Molecular Medicine, Sapienza University of Rome, Rome, Italy.
⁸ Department of Medical-Surgical Sciences and Biotechnologies, Sapienza University of Rome, Latina, Italy; Maria Cecilia Hospital, GVM Care and Research, Cotignola, Italy.

PMID: 41288474
DOI: 10.25270/jic/25.00104

Free article

Attilio Lauretti et al. J Invasive Cardiol. 2025.

J Invasive Cardiol. 2025 Nov 21.

doi: 10.25270/jic/25.00104. Online ahead of print.

Abstract

Objectives: Large language models (LLMs) have the potential to assist in complex decision making for interventional cardiology (IC). However, their comparative performance in providing clinical recommendations remains uncertain. In this blinded model‑comparison study, the authors evaluated and compared the quality of recommendations produced by 6 LLMs for complex IC cases.

Methods: Twenty detailed, complex clinical cases were developed, covering coronary artery disease (n=10) and structural heart disease (n=10). Six LLMs were tested: default ChatGPT (ChatGPTd), ChatGPT with European Society of Cardiology guidelines (ChatGPT-gl), ChatGPT with internet search enabled (ChatGPTi), Gemini (Google), Mistral 7B (Mistral AI), and Perplexity AI (Perplexity AI, Inc.). Randomization was applied only to the ordering of the anonymized outputs, to ensure blinding. Five expert interventional cardiologists independently assessed the anonymized and randomized responses on a 0 to 10 scale for appropriateness, accuracy, relevance, clarity, and clinical utility, generating a composite score. Statistical analysis was performed using a mixed linear model.
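
The blinding and scoring steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the neutral "Response A" labels, and the seed are hypothetical, and the abstract does not state how the composite score was derived, so the unweighted mean of the five dimensions used here is an assumption.

```python
import random
from statistics import mean

# Model labels and scoring dimensions as named in the abstract.
MODELS = ["ChatGPTd", "ChatGPT-gl", "ChatGPTi", "Gemini", "Mistral 7B", "Perplexity AI"]
DIMENSIONS = ["appropriateness", "accuracy", "relevance", "clarity", "clinical_utility"]


def blind_outputs(outputs, seed):
    """Shuffle the model outputs for one case and relabel them neutrally.

    outputs: dict mapping model name -> response text.
    Returns (blinded, key): blinded is a list of (label, text) pairs shown
    to raters; key maps each neutral label back to its model and stays
    hidden until scoring is complete.
    """
    rng = random.Random(seed)  # seeded so the ordering is reproducible
    models = list(outputs)
    rng.shuffle(models)
    labels = [f"Response {chr(ord('A') + i)}" for i in range(len(models))]
    blinded = [(label, outputs[model]) for label, model in zip(labels, models)]
    key = dict(zip(labels, models))
    return blinded, key


def composite_score(ratings):
    """Assumed composite: unweighted mean of the five 0-10 dimension scores."""
    for dim in DIMENSIONS:
        if dim not in ratings:
            raise ValueError(f"missing dimension: {dim}")
        if not 0 <= ratings[dim] <= 10:
            raise ValueError(f"{dim} must be between 0 and 10")
    return mean(ratings[dim] for dim in DIMENSIONS)
```

Keeping the label-to-model key separate from the blinded list means raters never see model names, while the analyst can still unblind the scores afterwards.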

Results: Six hundred blinded evaluations (20 cases × 6 models × 5 raters) were analyzed, yielding an overall composite score of 7.1 (95% CI, 7.0-7.2). Performance significantly varied across LLMs (P < .001), with ChatGPTi (7.8 [7.5-8.0]) and ChatGPT-gl (7.7 [7.4-7.9]) outperforming others. ChatGPTd (6.9 [6.6-7.3]), Mistral 7B (7.0 [6.7-7.3]), and Perplexity AI (7.0 [6.7-7.3]) performed moderately, while Gemini had the lowest score (6.3 [6.0-6.7]). These differences were consistent across all scoring dimensions (P < .001). Case type did not affect LLM performance (P = .900).
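
The evaluation count follows from a fully crossed design in which every rater scores every model's response to every case. A short sketch of that arithmetic is given below; the integer case and rater identifiers are placeholders, and the crossed structure is inferred from the stated product rather than described in detail by the authors.

```python
from collections import Counter
from itertools import product

N_CASES = 20
N_RATERS = 5
MODELS = ["ChatGPTd", "ChatGPT-gl", "ChatGPTi", "Gemini", "Mistral 7B", "Perplexity AI"]

# One tuple per (case, model, rater) combination in the crossed design.
evaluations = list(product(range(1, N_CASES + 1), MODELS, range(1, N_RATERS + 1)))

# Number of evaluations contributed by each model.
per_model = Counter(model for _, model, _ in evaluations)

print(len(evaluations))        # 600
print(per_model["Gemini"])     # 100
```

Each model therefore contributes 100 evaluations (20 cases × 5 raters) to its reported mean.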

Conclusions: LLMs show promise in IC decision making, but their performance remains suboptimal. Maximizing their potential requires systematic integration of web search capabilities and guideline-based knowledge retrieval.

Keywords: artificial intelligence; coronary artery disease; large language models; percutaneous coronary intervention.
