Int J Med Inform. 2013 Apr;82(4):239-47.
doi: 10.1016/j.ijmedinf.2012.05.015. Epub 2012 Jul 2.

The absence of longitudinal data limits the accuracy of high-throughput clinical phenotyping for identifying type 2 diabetes mellitus subjects

Wei-Qi Wei et al. Int J Med Inform. 2013 Apr.

Abstract

Purpose: To evaluate the impact of insufficient longitudinal data on the accuracy of a high-throughput clinical phenotyping (HTCP) algorithm for identifying (1) patients with type 2 diabetes mellitus (T2DM) and (2) patients with no diabetes.

Methods: Retrospective study conducted at Mayo Clinic in Rochester, Minnesota. Eligible subjects were Olmsted County residents with ≥1 Mayo Clinic encounter in each of three time periods: (1) 2007, (2) from 1997 through 2006, and (3) before 1997 (N = 54,283). Diabetes-relevant electronic medical record (EMR) data about diagnoses, laboratories, and medications were used. We employed the HTCP algorithm to categorize individuals as T2DM cases and non-diabetes controls. Considering the full 11 years (1997-2007) as the gold standard, we compared gold-standard categorizations with those using data for 10 progressively shorter intervals, ranging from 1998-2007 (10-year data) to 2007 (1-year data). Positive predictive values (PPVs) and false-negative rates (FNRs) were calculated. McNemar tests were used to determine whether categorizations using shorter time periods differed from the gold standard. Statistical significance was defined as P < 0.05.
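
The comparison described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes each subject has a boolean label from the gold-standard window and from a shorter window; the function names (`observation_windows`, `ppv_fnr`, `mcnemar_exact`) are hypothetical, and the exact (binomial) form of the McNemar test is an assumption, since the abstract does not state which variant was used.

```python
from math import comb

def observation_windows(first_year=1998, last_year=2007):
    """Shorter windows compared against the gold standard: 1998-2007 down to 2007 alone."""
    return [(start, last_year) for start in range(first_year, last_year + 1)]

def ppv_fnr(gold, short):
    """PPV and FNR of the short-window labels, treating the gold-standard labels as truth."""
    pairs = list(zip(gold, short))
    tp = sum(g and s for g, s in pairs)          # flagged in both
    fp = sum((not g) and s for g, s in pairs)    # flagged only with short-window data
    fn = sum(g and (not s) for g, s in pairs)    # missed with short-window data
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    return ppv, fnr

def mcnemar_exact(gold, short):
    """Two-sided exact McNemar p-value computed from the discordant pairs only."""
    pairs = list(zip(gold, short))
    b = sum(g and (not s) for g, s in pairs)
    c = sum((not g) and s for g, s in pairs)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: the two labelings agree completely
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Toy labels for illustration only; these are not data from the study.
gold = [True, True, True, True, False, False, False, False, False, False]
short = [True, True, True, False, True, False, False, False, False, False]
ppv, fnr = ppv_fnr(gold, short)
p_value = mcnemar_exact(gold, short)
```

In this framing a P value below 0.05 would indicate that the short-window categorization differs from the gold standard.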

Results: We identified 2770 T2DM cases and 21,005 controls when the algorithm was applied using 11-year data. Using 2007 data alone, PPVs and FNRs, respectively, were 70% and 25% for case identification and 59% and 67% for control identification. All time frames differed significantly from the gold standard, except for the 10-year period.

Conclusions: The accuracy of the algorithm declined markedly as data were limited to shorter observation periods. This effect should be considered carefully when designing or executing HTCP algorithms.

Conflict of interest statement

There are no conflicts of interest.

Figures

Figure 1. The eMERGE Algorithm for Identifying T2DM Cases. *Random glucose > 200 mg/dl, fasting glucose > 125 mg/dl, hemoglobin A1c ≥ 6.5%. Abbreviations: DM, diabetes mellitus; Dx, diagnosis; eMERGE, Electronic Medical Records and Genomics; HbA1c, hemoglobin A1c; ICD-9-CM, International Classification of Diseases, 9th Revision, Clinical Modification; Rx, prescription; T2DM, type 2 diabetes mellitus; T1DM, type 1 diabetes mellitus.

Figure 2. The eMERGE Algorithm for Identifying non-DM Controls. Abbreviations: DM, diabetes mellitus; Dx, diagnosis; eMERGE, Electronic Medical Records and Genomics; HbA1c, hemoglobin A1c; ICD-9-CM, International Classification of Diseases, 9th Revision, Clinical Modification; Rx, prescription; T1DM, type 1 diabetes mellitus; T2DM, type 2 diabetes mellitus.
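
The laboratory thresholds footnoted in the Figure 1 caption can be written as a small check. This Python sketch covers only those three cutoffs, not the full eMERGE decision flow, which the caption does not spell out. The function name `is_abnormal_lab` and the test-name strings are hypothetical.

```python
def is_abnormal_lab(test_name, value):
    """Return True if a lab value exceeds the cutoff given in the Figure 1 footnote.

    Glucose values are in mg/dl; hemoglobin A1c is in percent.
    """
    if test_name == "random_glucose":
        return value > 200
    if test_name == "fasting_glucose":
        return value > 125
    if test_name == "hba1c":
        return value >= 6.5  # the footnote uses >= for HbA1c, unlike the glucose cutoffs
    raise ValueError(f"unknown lab test: {test_name}")
```

Note that the HbA1c cutoff is inclusive, so a value of exactly 6.5% counts as abnormal, whereas a fasting glucose of exactly 125 mg/dl does not.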
