Harnessing the power of synthetic data in healthcare: innovation, application, and privacy
- PMID: 37813960
- PMCID: PMC10562365
- DOI: 10.1038/s41746-023-00927-3
Abstract
Data-driven decision-making in modern healthcare underpins innovation and predictive analytics in public health and clinical research. Synthetic data has shown promise in finance and economics to improve risk assessment, portfolio optimization, and algorithmic trading. However, higher stakes, potential liabilities, and healthcare practitioner distrust make clinical use of synthetic data difficult. This paper explores the potential benefits and limitations of synthetic data in the healthcare analytics context. We begin with real-world healthcare applications of synthetic data that inform government policy, enhance data privacy, and augment datasets for predictive analytics. We then preview future applications of synthetic data in the emergent field of digital twin technology. We explore the issues of data quality and data bias in synthetic data, which can limit applicability across different applications in the clinical context, and privacy concerns stemming from data misuse and risk of re-identification. Finally, we evaluate the role of regulatory agencies in promoting transparency and accountability and propose strategies for risk mitigation such as Differential Privacy (DP) and a dataset chain of custody to maintain data integrity, traceability, and accountability. Synthetic data can improve healthcare, but measures to protect patient well-being and maintain ethical standards are key to promoting responsible use.
© 2023. Springer Nature Limited.
Conflict of interest statement
The authors declare no competing interests.