Review
NPJ Digit Med. 2025 Nov 28;8(1):732.
doi: 10.1038/s41746-025-02112-0.

Protecting patient privacy in tabular synthetic health data: a regulatory perspective

Lisa Pilgram et al. NPJ Digit Med.

Abstract

Synthetic tabular data generation (SDG) is increasingly important for enabling healthcare research and innovation while preserving patients' privacy. However, ethical concerns remain, primarily over residual privacy vulnerability and insufficient oversight. This review analyzes the only SDG regulatory guidelines published to date, from the United Kingdom, Singapore, and South Korea. All three emphasize privacy and acknowledge that synthetic data is not inherently free from disclosure risks. Thresholds for sufficiently low risk have yet to be determined.

Conflict of interest statement

Competing interests: At the time of writing KEE had shares in Aetion, which acquired his university spin-off company that develops SDG software. KEE was also the Scholar-in-Residence at the Office of the Information and Privacy Commissioner of Ontario at the time of writing.

Figures

Fig. 1. Tabular Synthetic Data Generation for Healthcare.
Multiple generative models can be used, including AI/ML-based models.
