Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record
- PMID: 29025144
- PMCID: PMC7651916
- DOI: 10.1093/jamia/ocx079
Erratum in
- Erratum to: Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record. J Am Med Inform Assoc. 2018 Jul 1;25(7):921. doi: 10.1093/jamia/ocx147. PMID: 29253166.
Abstract
Objective: Our objective is to create a source of synthetic electronic health records that is readily available; suited to industrial, innovation, research, and educational uses; and free of legal, privacy, security, and intellectual property restrictions.
Materials and methods: We developed Synthea, an open-source software package that simulates the lifespans of synthetic patients, modeling the 10 most frequent reasons for primary care encounters and the 10 chronic conditions with the highest morbidity in the United States.
Results: Synthea adheres to a previously developed conceptual framework, scales via open-source deployment on the Internet, and may be extended with additional disease and treatment modules developed by its user community. One million synthetic patient records are now freely available online, encoded in standard formats (eg, Health Level-7 [HL7] Fast Healthcare Interoperability Resources [FHIR] and Consolidated-Clinical Document Architecture), and accessible through an HL7 FHIR application program interface.
Discussion: Health care lags other industries in information technology, data exchange, and interoperability. The lack of freely distributable health records has long hindered innovation in health care. Approaches and tools are available to inexpensively generate synthetic health records at scale without accidental disclosure risk, lowering current barriers to entry for promising early-stage developments. By engaging a growing community of users, the synthetic data generated will become increasingly comprehensive, detailed, and realistic over time.
Conclusion: Synthetic patients can be simulated with models of disease progression and corresponding standards of care to produce risk-free realistic synthetic health care records at scale.
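The idea of simulating a patient's lifespan with a disease progression model can be illustrated with a small state machine. The sketch below is not Synthea's implementation: the state names, the yearly transition probabilities, and the helpers `next_state` and `simulate_patient` are all hypothetical, chosen only to show how a per-patient event history might be generated from a model of onset, diagnosis, and treatment.

```python
import random
from dataclasses import dataclass, field

# Hypothetical yearly transition probabilities, invented for illustration.
# They are not taken from Synthea or from any clinical source.
TRANSITIONS = {
    "healthy": [("onset", 0.02), ("healthy", 0.98)],
    "onset": [("diagnosed", 0.50), ("onset", 0.50)],
    "diagnosed": [("treated", 1.00)],
    "treated": [("treated", 1.00)],
}


@dataclass
class Patient:
    patient_id: int
    state: str = "healthy"
    # Synthetic record: a list of (age_in_years, new_state) events.
    record: list = field(default_factory=list)


def next_state(state, rng):
    """Sample the next state from the transition table for `state`."""
    roll = rng.random()
    cumulative = 0.0
    for target, prob in TRANSITIONS[state]:
        cumulative += prob
        if roll < cumulative:
            return target
    return state  # guard against floating-point rounding


def simulate_patient(patient_id, max_age, rng):
    """Step one synthetic patient through each year of life up to max_age."""
    patient = Patient(patient_id)
    for age in range(max_age + 1):
        new_state = next_state(patient.state, rng)
        if new_state != patient.state:
            # Only state changes are written to the synthetic record.
            patient.record.append((age, new_state))
            patient.state = new_state
    return patient


# A seeded generator makes the synthetic cohort reproducible.
rng = random.Random(42)
cohort = [simulate_patient(i, 80, rng) for i in range(1000)]
```

Because every value comes from a seeded random generator rather than from real patients, a cohort produced this way carries no disclosure risk, and richer behavior would come from adding states and transitions rather than changing the simulation loop.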
Keywords: RS-EHR; clinical pathways; computer simulation; electronic health records; patient-specific modeling.
© The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association.
