JAMIA Open. 2020 Dec 5;3(4):583-592.
doi: 10.1093/jamiaopen/ooaa059. eCollection 2020 Dec.

Cardiovascular disease risk prediction for people with type 2 diabetes in a population-based cohort and in electronic health record data

Jackie Szymonifka et al. JAMIA Open. 2020.

Abstract

Objective: Electronic health records (EHRs) have become a common data source for clinical risk prediction, offering large sample sizes and frequently sampled metrics. Hospital-based EHR samples may differ notably from traditional cohort samples: EHR data are often not population-representative random samples, even for particular diseases, because EHR patients tend to be sicker and to have higher healthcare utilization, whereas cohort studies often enroll healthier subjects, who are typically more likely to participate. We investigate heterogeneities between EHR- and cohort-based inferences, including incidence rates, risk factor identification and quantification, and absolute risks.

Materials and methods: This is a retrospective cohort study of older patients with type 2 diabetes mellitus (T2DM), using EHR data from New York University Langone Health ambulatory care (NYULH-EHR, 2009-2017) and data from the Health and Retirement Survey (HRS, 1995-2014) to study subsequent cardiovascular disease (CVD) risk. We applied the same eligibility criteria, outcome definitions, and demographic covariates and biomarkers to both datasets. We compared subsequent CVD incidence rates, hazard ratios (HRs) of risk factors, and the discrimination and calibration performance of CVD risk scores.

Results: The estimated incidence rate of subsequent total CVD since T2DM onset was 37.5 per 1000 person-years in HRS and 90.6 per 1000 person-years in NYULH-EHR. HR estimates were comparable between the datasets for most demographic covariates and biomarkers. Common CVD risk scores underestimated observed total CVD risks in NYULH-EHR.
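As a minimal sketch of how an incidence rate per 1000 person-years is computed, the following divides the number of incident events by the total follow-up time. The function name and the toy follow-up values are hypothetical illustrations, not data or code from the study.

```python
def incidence_rate_per_1000(n_events, person_years):
    """Crude incidence rate: events per 1000 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * n_events / person_years


# Toy data (hypothetical): years of follow-up since T2DM onset for six
# subjects, and whether each had an incident CVD event during follow-up.
followup_years = [2.0, 3.5, 1.5, 4.0, 5.0, 4.0]
had_event = [True, False, True, False, False, False]

rate = incidence_rate_per_1000(sum(had_event), sum(followup_years))
print(f"{rate:.1f} events per 1000 person-years")
```

Each subject contributes follow-up time only until the event or censoring, so two samples with the same number of events can yield very different rates if their follow-up durations differ.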

Discussion and conclusion: EHR-estimated HRs of demographic and major clinical risk factors for CVD were mostly consistent with the estimates from a national cohort, despite the higher incidence and absolute risk of total CVD in the EHR sample.

Keywords: cardiovascular disease; cohort analysis; electronic health records; risk factors; type 2 diabetes mellitus.

Figures

Figure 1.
Inclusion criteria and study flow chart for (A) NYULH-EHR T2DM patients and (B) HRS T2DM respondents. (A) Inclusion criteria for the NYULH-EHR cohort. Among patients seen in the New York University Langone Health ambulatory care clinic between 2009 and 2017 who met the eligibility criteria outlined in Supplementary Figure S1, we first limited the analysis cohort to encounters with patients 50 years of age or older. We then reduced the eligible cohort to patients aged ≥ 50 years who had T2DM, as defined in Supplementary Figure S1. We removed patients who met the criteria for T2DM status at the initial encounter, since the date of initial diagnosis could not be reliably estimated. Finally, we removed patients who already met the criteria for CVD diagnosis, so that subsequent incident CVD cases could be identified. (B) Inclusion criteria for the HRS cohort. Among respondents to the HRS between 1992 and 2014, we first limited the analysis cohort to respondents who were 50 years of age or older during at least one interview. We then reduced the eligible cohort to respondents aged ≥ 50 years with self-reported and subsequently adjudicated T2DM. We also eliminated T2DM cases that were self-reported at the first interview, since the date of initial diagnosis could not be reliably estimated. Finally, we removed respondents with self-reported, and subsequently adjudicated, CVD or stroke at or before the first interview at which T2DM was reported, so that subsequent incident CVD cases could be identified.
Figure 2.
Calibration plots comparing predicted and observed 5-year risks. (A) HRS cohort using the Framingham risk score (FRS) to predict total CVD outcome; (B) NYULH-EHR cohort using the FRS to predict total CVD outcome; (C) NYULH-EHR cohort using the ACC/AHA pooled cohort equations to predict hard CVD outcome; (D) NYULH-EHR cohort using the Swedish National Diabetes Register (NDR) risk model to predict hard CVD outcome. The risk factors in the FRS global CVD function and the ACC/AHA function are age, total cholesterol (TC), high-density lipoprotein cholesterol (HDL), systolic blood pressure (SBP), treatment for hypertension, smoking, and T2DM status (yes for all subjects). Sex-specific risk equations were applied to males and females separately. For the ACC/AHA risk score, African American (AA) coefficients were used for AAs and white coefficients were used for all other patients. To align with the available follow-up of the present cohorts, we replaced the 10-year baseline survival estimates with 5-year estimates by assuming exponential survival distributions. The risk factors in the Swedish NDR risk prediction functions are onset age of T2DM, T2DM duration, sex, body mass index (BMI), smoking, HbA1c, SBP, and use of antihypertensive and lipid-reducing drugs. We divided each cohort into deciles of predicted risk and calculated the mean predicted risk within each decile. We calculated the observed risk within each decile from the Kaplan–Meier estimate and plotted observed against predicted risk.
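The two computational steps in this caption can be sketched as follows, under stated assumptions. If baseline survival is exponential, S0(t) = exp(-λt), then S0(5) = S0(10)^(5/10), which is how a 10-year baseline can be rescaled to 5 years. The decile calibration then pairs the mean predicted risk in each decile with one minus the Kaplan–Meier survival estimate at the horizon. All function names here are hypothetical, and this plain-Python Kaplan–Meier estimator is an illustration rather than the authors' implementation.

```python
def rescale_baseline_survival(s0_original, horizon_years=5.0, original_years=10.0):
    """Rescale baseline survival assuming a constant (exponential) hazard:
    S0(t) = exp(-lam * t)  =>  S0(h) = S0(T) ** (h / T)."""
    return s0_original ** (horizon_years / original_years)


def kaplan_meier_risk(times, events, horizon):
    """Observed risk at `horizon`: 1 minus the Kaplan-Meier survival estimate.
    `times` are follow-up durations; `events` flag an event (True) or censoring."""
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - n_events / at_risk
    return 1.0 - survival


def calibration_by_decile(predicted, times, events, horizon=5.0, n_groups=10):
    """Return (mean predicted risk, observed KM risk) for each predicted-risk decile."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    rows = []
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:
            continue  # fewer subjects than groups
        mean_pred = sum(predicted[i] for i in idx) / len(idx)
        observed = kaplan_meier_risk(
            [times[i] for i in idx], [events[i] for i in idx], horizon
        )
        rows.append((mean_pred, observed))
    return rows
```

Plotting the returned pairs against the diagonal gives a calibration plot: points above the diagonal indicate deciles where the score underestimates observed risk.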
