A General Primer for Data Harmonization
- PMID: 38297013
- PMCID: PMC10831085
- DOI: 10.1038/s41597-024-02956-3
Abstract
Data harmonization is an important method for combining or transforming data. To date, however, articles about data harmonization have been field-specific and highly technical, making it difficult for researchers to derive general principles for how to engage in and contextualize data harmonization efforts. This commentary provides a primer on the tradeoffs inherent in data harmonization for researchers who are considering undertaking such efforts or who seek to evaluate the quality of existing ones. We derive this guidance from the extant literature and our own experience in harmonizing data for the emergent and important field of COVID-19 public health and safety measures (PHSM).
Conflict of interest statement
The authors declare no competing interests.
Grants and funding
- 101016233/EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)
- 832-06g/National Council for Eurasian and East European Research (NCEEER)
