Sci Data. 2024 Jan 31;11(1):152.
doi: 10.1038/s41597-024-02956-3.

A General Primer for Data Harmonization

Cindy Cheng et al. Sci Data. 2024.

Abstract

Data harmonization is an important method for combining or transforming data. To date, however, articles about data harmonization have been field-specific and highly technical, making it difficult for researchers to derive general principles for how to engage in and contextualize data harmonization efforts. This commentary provides a primer on the tradeoffs inherent in data harmonization for researchers who are considering undertaking such efforts or who seek to evaluate the quality of existing ones. We derive this guidance from the extant literature and our own experience in harmonizing data for the emergent and important field of COVID-19 public health and social measures (PHSM).

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. General Steps for Data Harmonization.
