Synthetic data in health care: A narrative review
- PMID: 36812604
- PMCID: PMC9931305
- DOI: 10.1371/journal.pdig.0000082
Abstract
Data are central to research, public health, and the development of health information technology (IT) systems. Nevertheless, access to most data in health care is tightly controlled, which may limit innovation, development, and efficient implementation of new research, products, services, or systems. Using synthetic data is one of the many innovative ways that organizations can share datasets with a broader set of users. However, only limited literature explores its potential and applications in health care. In this review paper, we examined existing literature to bridge this gap and highlight the utility of synthetic data in health care. We searched PubMed, Scopus, and Google Scholar to identify peer-reviewed articles, conference papers, reports, and theses/dissertations related to the generation and use of synthetic datasets in health care. The review identified seven use cases of synthetic data in health care: a) simulation and prediction research, b) hypothesis, methods, and algorithm testing, c) epidemiology/public health research, d) health IT development, e) education and training, f) public release of datasets, and g) linking data. The review also identified readily and publicly accessible health care datasets, databases, and sandboxes containing synthetic data with varying degrees of utility for research, education, and software development. The review provided evidence that synthetic data are helpful in different aspects of health care and research. While the original real data remain the preferred choice, synthetic data hold possibilities in bridging data access gaps in research and evidence-based policymaking.
Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Conflict of interest statement
The authors have declared that no competing interests exist.