2025 Jun 10:2025:441-450. eCollection 2025.

Not Fully Synthetic: LLM-based Hybrid Approaches Towards Privacy-Preserving Clinical Note Sharing

Atiquer Rahman Sarkar et al. AMIA Jt Summits Transl Sci Proc.

Abstract

The publication and sharing of clinical notes are crucial for healthcare research and innovation. However, privacy regulations such as HIPAA and GDPR pose significant challenges. While de-identification techniques aim to remove protected health information, they often fall short of achieving complete privacy protection. Similarly, the current state of synthetic clinical note generation can lack nuance and content coverage. To address these limitations, we propose an approach that combines de-identification, filtration, and synthetic clinical note generation. Variations of this approach currently retain 36%-61% of the original note's content and fill the remaining gaps using an LLM, ensuring high information coverage. We also evaluated the de-identification performance of the hybrid notes, demonstrating that they surpass or at least match the standalone de-identification methods. Our results show that hybrid notes can maintain patient privacy while preserving the richness of clinical data. This approach offers a promising solution for safe and effective data sharing, encouraging further research.
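The pipeline outlined in the abstract (de-identify, filter, then fill the gaps with generated text) can be illustrated with a minimal sketch. This is not the authors' algorithm: the regex patterns, the sentence-level filtering rule, and the names `contains_phi`, `make_hybrid_note`, and `placeholder_fill` are all assumptions made for illustration, and the LLM call is replaced by a stub that returns fixed text.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical, deliberately simplistic PHI patterns (assumption).
# A real de-identification tool covers far more categories.
PHI_PATTERNS = [
    re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),  # dates such as 03/14/2021
    re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),        # phone numbers such as 555-123-4567
]


def contains_phi(sentence: str) -> bool:
    """Return True if any illustrative PHI pattern matches the sentence."""
    return any(p.search(sentence) for p in PHI_PATTERNS)


def split_sentences(note: str) -> List[str]:
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def make_hybrid_note(
    note: str, fill_gap: Callable[[List[str]], str]
) -> Tuple[str, float]:
    """Keep PHI-free sentences, replace the rest with generated text.

    fill_gap receives only the retained sentences as context, so the
    generator never sees the filtered-out text (a design assumption).
    Returns the hybrid note and the fraction of original characters kept.
    """
    sentences = split_sentences(note)
    kept = [s for s in sentences if not contains_phi(s)]
    output = [fill_gap(kept) if contains_phi(s) else s for s in sentences]
    total = sum(len(s) for s in sentences)
    retained = sum(len(s) for s in kept) / total if total else 0.0
    return " ".join(output), retained


def placeholder_fill(context: List[str]) -> str:
    """Stub standing in for an LLM call (hypothetical)."""
    return "[synthetic sentence]"


note = (
    "Patient seen on 03/14/2021. Reports chest pain for two days. "
    "Call 555-123-4567 for follow-up. No fever."
)
hybrid, retained = make_hybrid_note(note, placeholder_fill)
print(hybrid)
print(f"retained fraction: {retained:.2f}")
```

In this toy version the retained fraction is measured in characters of kept sentences; the 36%-61% figures reported in the abstract may be computed differently.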

Figures

Figure 1. The idea behind hybrid note generation.
Figure 2. Algorithm for Hybrid Note Generation.
