2020 Oct 7;12(Suppl1):20190015.
doi: 10.1515/scid-2019-0015. eCollection 2020 Sep 1.

Errors in multiple variables in human immunodeficiency virus (HIV) cohort and electronic health record data: statistical challenges and opportunities

Bryan E Shepherd et al. Stat Commun Infect Dis.

Abstract

Objectives: Observational data derived from patient electronic health records (EHR) are increasingly used for human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS) research. There are challenges to using these data, in particular with regard to data quality; some are recognized, some unrecognized, and some recognized but ignored. There are great opportunities for the statistical community to improve inference by incorporating validation subsampling into analyses of EHR data.

Methods: Methods to address measurement error, misclassification, and missing data are relevant, as are sampling designs such as two-phase sampling. However, many of the existing statistical methods for measurement error, for example, address only relatively simple settings, whereas the errors seen in these datasets span multiple variables (both predictors and outcomes), are correlated, and even affect who is included in the study.

Results/Conclusion: We will discuss some preliminary methods in this area with a particular focus on time-to-event outcomes and outline areas of future research.

Keywords: HIV; electronic health records; measurement error; misclassification; two-phase sampling.
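
The idea of validation subsampling mentioned in the abstract can be illustrated with a deliberately simple sketch. It is not the authors' method: the abstract concerns correlated errors across multiple variables and time-to-event outcomes, whereas this example assumes a single binary outcome that is misclassified in the EHR and a simple random validation subsample in which the true value is observed. The function names (`corrected_prevalence`, `simulate`) and all parameter values (prevalence, sensitivity, specificity, sample sizes) are hypothetical and chosen only for illustration.

```python
import random


def corrected_prevalence(phase1_flags, validated):
    """Double-sampling estimate of P(Y = 1) for a misclassified binary outcome.

    phase1_flags: 0/1 error-prone outcomes Y* for every record (phase one).
    validated: (y_star, y_true) pairs from the validation subsample (phase two).

    Uses P(Y=1) = P(Y=1 | Y*=1) P(Y*=1) + P(Y=1 | Y*=0) P(Y*=0), with the
    conditional probabilities estimated from the validated records only.
    """
    p_flag = sum(phase1_flags) / len(phase1_flags)  # P(Y* = 1) in the full cohort
    estimate = 0.0
    for level, weight in ((1, p_flag), (0, 1.0 - p_flag)):
        truths = [y for y_star, y in validated if y_star == level]
        if not truths:
            if weight > 0:
                raise ValueError(f"no validated records with Y* = {level}")
            continue
        estimate += weight * sum(truths) / len(truths)
    return estimate


def simulate(n=20000, n_val=4000, prevalence=0.2, sens=0.8, spec=0.9, seed=1):
    """Simulate an error-prone cohort and compare naive and corrected estimates.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    y = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    y_star = []
    for yi in y:
        if yi == 1:
            y_star.append(1 if rng.random() < sens else 0)  # true events missed at rate 1 - sens
        else:
            y_star.append(0 if rng.random() < spec else 1)  # false positives at rate 1 - spec
    # Phase two: validate a simple random subsample of the records.
    chosen = rng.sample(range(n), n_val)
    validated = [(y_star[i], y[i]) for i in chosen]
    return {
        "truth": sum(y) / n,
        "naive": sum(y_star) / n,
        "corrected": corrected_prevalence(y_star, validated),
    }
```

Calling `simulate()` returns the true prevalence in the simulated cohort, the naive estimate based on the error-prone records alone, and the estimate corrected with the validation subsample. Under these assumptions the naive estimate is biased by misclassification, while the corrected estimate targets the true prevalence at the cost of extra variance from the smaller validated sample.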

Conflict of interest statement

Competing interests: Authors state no conflict of interest.
