Review
Lancet Digit Health. 2022 Dec;4(12):e893-e898.
doi: 10.1016/S2589-7500(22)00154-6. Epub 2022 Sep 22.

Leveraging electronic health records for data science: common pitfalls and how to avoid them

Free article

Christopher M Sauer et al. Lancet Digit Health. 2022 Dec.

Abstract

Analysis of electronic health records (EHRs) is an increasingly common approach for studying real-world patient data. Use of routinely collected data offers several advantages compared with other study designs, including reduced administrative costs, the ability to update analysis as practice patterns evolve, and larger sample sizes. Methodologically, EHR analysis is subject to distinct challenges because data are not collected for research purposes. In this Viewpoint, we elaborate on the importance of in-depth knowledge of clinical workflows and describe six potential pitfalls to be avoided when working with EHR data, drawing on examples from the literature and our experience. We propose solutions for prevention or mitigation of factors associated with each of these six pitfalls: sample selection bias, imprecise variable definitions, limitations to deployment, variable measurement frequency, subjective treatment allocation, and model overfitting. Ultimately, we hope that this Viewpoint will guide researchers to further improve the methodological robustness of EHR analysis.

Conflict of interest statement

Declaration of interests: SLH is an employee of Microsoft Research (UK) and a board member of the non-profit organisation Association for Health Learning and Inference. All other authors declare no competing interests.