Not Fully Synthetic: LLM-based Hybrid Approaches Towards Privacy-Preserving Clinical Note Sharing
- PMID: 40502247
- PMCID: PMC12150723
Abstract
The publication and sharing of clinical notes are crucial for healthcare research and innovation. However, privacy regulations such as HIPAA and GDPR pose significant challenges. De-identification techniques aim to remove protected health information, but they often fall short of complete privacy protection. Similarly, current synthetic clinical note generation can lack nuance and content coverage. To address these limitations, we propose a hybrid approach that combines de-identification, filtration, and synthetic clinical note generation. Variations of this approach currently retain 36%-61% of the original note's content and use an LLM to fill the remaining gaps, ensuring high information coverage. We also evaluated the de-identification performance of the hybrid notes and found that they match or surpass standalone de-identification methods. Our results show that hybrid notes can maintain patient privacy while preserving the richness of clinical data. This approach offers a promising solution for safe and effective data sharing and motivates further research.
©2025 AMIA - All rights reserved.
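The abstract describes the hybrid pipeline only at a high level, so the following is a minimal illustrative sketch of the general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the regex patterns, the rule that any line containing detected PHI is filtered out, the line-level retention ratio, and the `stub_generate` function standing in for an LLM call.

```python
import re

# Hypothetical PHI patterns (assumption); a real de-identifier covers far more.
PHI_PATTERNS = [
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"), "[DATE]"),
    (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), "[PHONE]"),
]


def deidentify(line):
    """Replace every matched PHI span with a placeholder tag."""
    for pattern, tag in PHI_PATTERNS:
        line = pattern.sub(tag, line)
    return line


def stub_generate(masked_line):
    """Stand-in for an LLM call (assumption): returns marked filler text."""
    return "[SYNTHETIC] " + masked_line


def build_hybrid_note(note, generate):
    """Keep lines with no detected PHI; regenerate the rest.

    Returns the hybrid note and the fraction of original lines kept verbatim.
    """
    lines = [ln.strip() for ln in note.splitlines() if ln.strip()]
    output, kept = [], 0
    for line in lines:
        masked = deidentify(line)
        if masked == line:
            # Filtration step (assumed rule): no PHI detected, so retain as-is.
            output.append(line)
            kept += 1
        else:
            # Gap-filling step: only the masked text is passed to the generator.
            output.append(generate(masked))
    ratio = kept / len(lines) if lines else 0.0
    return "\n".join(output), ratio


SAMPLE_NOTE = "\n".join([
    "Seen on 03/14/2021 for chest pain.",
    "ECG showed sinus rhythm.",
    "Call 555-123-4567 with results.",
    "Started aspirin 81 mg daily.",
])

hybrid, retained = build_hybrid_note(SAMPLE_NOTE, stub_generate)
```

In this toy note, two of the four lines contain no detected PHI and are kept verbatim, while the other two are masked and handed to the generator. The retention figure here is a line count chosen for simplicity; the abstract does not say how the reported 36%-61% is measured.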