BMC Med Inform Decis Mak. 2021 Oct 20;21(1):289.
doi: 10.1186/s12911-021-01643-2.

A statistical quality assessment method for longitudinal observations in electronic health record data with an application to the VA million veteran program

Hui Wang et al. BMC Med Inform Decis Mak.

Abstract

Background: To describe an automated method for assessing the plausibility of continuous variables in electronic health record (EHR) data used for real-world evidence research.

Methods: The most widely used approach to quality assessment (QA) of continuous variables is to detect implausible values using prespecified thresholds. To augment the thresholding method, we developed a score-based method that leverages the longitudinal characteristics of EHR data to detect observations that are inconsistent with a patient's history. The method was applied to the height and weight data in the EHR of the Million Veteran Program from the Veterans Health Administration (VHA). A validation study was also conducted.
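The contrast between the two kinds of check can be illustrated with a minimal sketch. The abstract does not give the score formula, so the deviation measure below (distance from the patient's own median, scaled by the median absolute deviation) is a stand-in chosen for illustration, not the score defined in the paper. The function names, the plausible range of 48 to 84 inches, and the `min_scale` floor are all assumptions.

```python
from statistics import median


def threshold_flags(values, low, high):
    """Thresholding check: flag values outside a prespecified plausible range."""
    return [not (low <= v <= high) for v in values]


def history_deviation(values, min_scale=1.0):
    """Illustrative longitudinal check (NOT the paper's score).

    Returns the absolute deviation of each value from the patient's own
    median, in units of the median absolute deviation (MAD). The scale is
    floored at min_scale (hypothetical) so near-constant histories do not
    produce a division by a value close to zero.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    scale = max(mad, min_scale)
    return [abs(v - center) / scale for v in values]


# Hypothetical height history (inches) for one patient.
heights = [70.0, 70.5, 70.0, 60.0, 70.2]

# Assumed plausible range for adult height; every reading passes.
print(threshold_flags(heights, low=48.0, high=84.0))

# The 60-inch reading stands out only when compared with the patient's history.
print(history_deviation(heights))
```

In this example the 60-inch reading is a plausible adult height, so a range check alone cannot flag it, while its distance from the patient's other readings is large. That is the kind of inconsistency a longitudinal method is meant to surface.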

Results: The developed method outperforms the widely used thresholding method on receiver operating characteristic (ROC) metrics. We also show that the choice of quality assessment method has a non-ignorable impact on the body mass index (BMI) classification calculated from height and weight data in the VHA's database.
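A small worked example shows why a single inconsistent height reading can change a BMI class. The conversion factor 703 for pounds and inches and the WHO adult cut points (18.5, 25, 30) are standard, but the abstract does not state which BMI categories the study used, so the category labels and function names here are assumptions.

```python
def bmi_from_imperial(weight_lb, height_in):
    """BMI in kg/m^2 from weight in pounds and height in inches."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return 703.0 * weight_lb / (height_in ** 2)


def bmi_class(bmi):
    """WHO adult categories (assumed; the abstract does not list the ones used)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


# Hypothetical patient: 180 lb, true height 70 in, one reading recorded as 60 in.
for height in (70.0, 60.0):
    bmi = bmi_from_imperial(180.0, height)
    print(height, round(bmi, 1), bmi_class(bmi))
```

With the 70-inch height the patient falls in the overweight range, and with the 60-inch reading the same weight falls in the obese range. Whether a QA method keeps or discards that reading therefore changes the classification.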

Conclusions: The score-based method enables automated, scalable detection of problematic data points in health care big data while allowing investigators to select high-quality data according to their needs. Leveraging the longitudinal characteristics of EHR data will significantly improve QA performance.

Keywords: Clinical informatics; Data quality assessment (DQA); Electronic health record (EHR); Health care big data; Real world evidence; Vital signs.

Conflict of interest statement

There is no competing interest for this study.

Figures

Fig. 1
Histograms of height measurements (inch) stratified by the agreement between QR and QS. a Height measurements where both QR and QS > 0.05 cutoff value; b Height measurements where QR ≤ 0.05 and QS > 0.05; c Height measurements where QR > 0.05 and QS ≤ 0.05; d Height measurements where both QR and QS ≤ 0.05 cutoff value. The x-axis represents the height values, and the y-axis represents frequency counts
Fig. 2
Histograms of weight measurements (lb) stratified by the agreement between QR and QS. a Weight measurements where both QR and QS ≥ 0.05 cutoff value; b Weight measurements where QR ≤ 0.05 and QS ≥ 0.05; c Weight measurements where QR ≥ 0.05 and QS ≤ 0.05; d Weight measurements where both QR and QS ≤ 0.05 cutoff value. The x-axis represents the weight values, and the y-axis represents frequency counts
