J Biomed Inform. 2014 Dec:52:105-11.
doi: 10.1016/j.jbi.2014.08.012. Epub 2014 Sep 6.

Evaluation of matched control algorithms in EHR-based phenotyping studies: a case study of inflammatory bowel disease comorbidities

Victor M Castro et al. J Biomed Inform. 2014 Dec.

Abstract

The success of many population studies depends on proper matching of cases to controls. Some of the confounding and bias that afflict electronic health record (EHR)-based observational studies may be reduced by creating effective methods for finding adequate controls. We implemented a method to match case and control populations that compensates for the sparse and unequal data collection practices common in EHR data. The method matches patients on healthcare utilization, motivated by the observation that more complete data are collected on high-utilization patients than on low-utilization patients. Our results show that this matching method mitigates many of the anomalous differences in population comparisons relative to traditional age- and gender-based matching. As an example, the comparison of the disease associations of ulcerative colitis and Crohn's disease shows differences that are not present when the controls are chosen in a random or even a matched age/gender/race algorithm. In conclusion, the use of healthcare utilization-based matching algorithms to find adequate controls greatly enhanced the accuracy of results in EHR studies. Full source code and documentation of the control matching methods are available at https://community.i2b2.org/wiki/display/conmat/.
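The idea of matching on healthcare utilization can be illustrated with a minimal sketch. It is not the authors' implementation (that is at the URL above); it assumes a greedy 1:1 nearest-neighbor match without replacement on two hypothetical utilization measures, a count of recorded observations (`n_facts`) and the length of the observation period (`followup_days`). The `Patient` type, the function name, and the 20% tolerances are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: str
    n_facts: int        # hypothetical: number of recorded observations in the EHR
    followup_days: int  # hypothetical: days between first and last observation


def match_controls(cases, pool, fact_tol=0.2, followup_tol=0.2):
    """Greedy 1:1 matching without replacement on utilization (illustrative only).

    A control qualifies when its fact count and follow-up length are each
    within a relative tolerance of the case's values. Among qualifying
    controls, the one with the smallest summed relative difference is chosen.
    Cases with no qualifying control are left unmatched.
    """
    available = list(pool)
    matches = {}
    for case in cases:
        best, best_dist = None, None
        for ctrl in available:
            d_facts = abs(ctrl.n_facts - case.n_facts) / max(case.n_facts, 1)
            d_follow = abs(ctrl.followup_days - case.followup_days) / max(case.followup_days, 1)
            if d_facts > fact_tol or d_follow > followup_tol:
                continue  # outside the assumed tolerance on at least one measure
            dist = d_facts + d_follow
            if best is None or dist < best_dist:
                best, best_dist = ctrl, dist
        if best is not None:
            matches[case.patient_id] = best.patient_id
            available.remove(best)  # each control is used at most once
    return matches


cases = [Patient("c1", 100, 1000), Patient("c2", 10, 100), Patient("c3", 1000, 50)]
pool = [Patient("p1", 105, 990), Patient("p2", 11, 95), Patient("p3", 500, 3000)]
matched = match_controls(cases, pool)
```

In this toy data the high-utilization case pairs with the high-utilization control and the low-utilization case with the low-utilization control, while `c3` stays unmatched because no control falls inside the tolerances. A greedy pass is order-dependent; an optimal assignment method could be swapped in without changing the interface.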

Keywords: Comorbidity; Controls; EHR; Inflammatory bowel disease; Matching.


Figures

Figure 1
Matching algorithms flow diagram
Figure 2
Distribution of Crohn’s disease (CD) comorbidity associations with different matched control patients. Each line represents a case-control comparison across 806 comorbidities using unmatched controls (CD97_Random), controls matched on age, gender and race (CD97_AGR), controls matched on observation frequency and period in the EHR (CD97_NFL) and controls matched on all factors (CD97_AGRNFL). The y-axis represents the proportion of comorbidities meeting statistical significance (Bonferroni-adjusted) at the relative risk (RR). Lines are smoothed using a Gaussian kernel function (Density).
Figure 3
Distribution of ulcerative colitis (UC) comorbidity associations with different matched control patients. Each line represents a case-control comparison across 806 comorbidities using unmatched controls (UC97_Random), controls matched on age, gender and race (UC97_AGR), controls matched on observation frequency and period in the EHR (UC97_NFL) and controls matched on all factors (UC97_AGRNFL). The y-axis represents the proportion of comorbidities meeting statistical significance (Bonferroni-adjusted) at the relative risk (RR). Lines are smoothed using a Gaussian kernel function (Density).
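
The two quantities named in the figure captions, relative risk and a Bonferroni-adjusted significance threshold, can be sketched as follows. The captions give the number of comparisons (806) but not the base significance level or the statistical test used, so the 0.05 level and the patient counts below are assumptions for illustration only.

```python
def relative_risk(case_events, n_cases, control_events, n_controls):
    """Risk of a comorbidity among cases divided by the risk among controls."""
    if control_events == 0:
        raise ValueError("relative risk is undefined when no controls have the comorbidity")
    # Cross-multiplied form of (case_events / n_cases) / (control_events / n_controls)
    return (case_events * n_controls) / (control_events * n_cases)


def bonferroni_threshold(alpha, n_tests):
    """Per-comparison significance threshold after Bonferroni adjustment."""
    return alpha / n_tests


# Hypothetical counts: 30 of 1000 cases vs. 10 of 1000 controls have a comorbidity.
rr = relative_risk(30, 1000, 10, 1000)

# 806 comorbidities are compared; alpha = 0.05 is an assumed base level.
threshold = bonferroni_threshold(0.05, 806)
```

With these assumed inputs the relative risk is 3.0, and each of the 806 comparisons would need a p-value below roughly 6.2e-5 to count as significant.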
