Comparing artificial intelligence- vs clinician-authored summaries of simulated primary care electronic health records
- PMID: 40741008
- PMCID: PMC12309840
- DOI: 10.1093/jamiaopen/ooaf082
Abstract
Objective: To compare clinical summaries of simulated primary care electronic health records (EHRs) generated by GPT-4 with summaries written by clinicians across multiple domains of quality, including utility, concision, accuracy, and bias.
Materials and methods: Seven primary care physicians generated 70 simulated patient EHR notes, each representing 10 patient contacts with the practice over at least 2 years. Each record was summarized by a different clinician and by GPT-4. Artificial intelligence (AI)- and clinician-authored summaries were rated blind by clinicians on 8 domains of quality and given an overall rating.
Results: The median time taken for a clinician to read through and assimilate the information in the EHRs before summarizing was 7 minutes. Clinicians rated clinician-authored summaries higher than AI-authored summaries overall (7.39 vs 7.00 out of 10; P = .02), but ratings of clinician-authored summaries were more variable. AI- and clinician-authored summaries had similar accuracy, and AI-authored summaries were less likely to omit important information and more likely to use patient-friendly language.
Discussion: Although AI-authored summaries were rated slightly lower overall compared with clinician-authored summaries, they demonstrated similar accuracy and greater consistency. This demonstrates potential applications for generating summaries in primary care, particularly given the substantial time taken for clinicians to undertake this work.
Conclusion: The results suggest the feasibility, utility, and acceptability of integrating AI-authored summaries into EHRs to support clinicians in primary care. AI summarization tools have the potential to improve healthcare productivity, including by enabling clinicians to spend more time on direct patient care.
Keywords: electronic health records; generative AI; health informatics; large language models; primary care.
© The Author(s) 2025. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Conflict of interest statement
The authors have no competing interests to declare.