Environ Health Prev Med. 2023:28:16.
doi: 10.1265/ehpm.22-00106.

Development and validation of ischemic heart disease and stroke prognostic models using large-scale real-world data from Japan

Shigeto Yoshida et al. Environ Health Prev Med. 2023.

Abstract

Background: Previous cardiovascular risk prediction models in Japan have utilized prospective cohort studies with concise data. As health information, including health check-up records and administrative claims, becomes digitalized and publicly available, large datasets based on such real-world data can improve prediction accuracy and support the social implementation of cardiovascular disease risk prediction models in preventive and clinical practice. In this study, classical regression and machine learning methods were explored to develop ischemic heart disease (IHD) and stroke prognostic models using real-world data.

Methods: The IQVIA Japan Claims Database was searched to include 691,160 individuals (predominantly corporate employees and their families working in secondary and tertiary industries) with at least one annual health check-up record during the identification period (April 2013-December 2018). The primary outcome of the study was the first recorded IHD or stroke event. Predictors were annual health check-up records at the index year-month, comprising demographic characteristics, laboratory tests, and questionnaire features. Four prediction models (Cox, Elnet-Cox, XGBoost, and Ensemble) were assessed in the present study to develop a cardiovascular disease risk prediction model for Japan.

Results: The analysis cohort consisted of 572,971 individuals. All prediction models showed similarly good performance. The Harrell's C-index was close to 0.9 for all IHD models, and above 0.7 for stroke models. In IHD models, age, sex, high-density lipoprotein, low-density lipoprotein, cholesterol, and systolic blood pressure had higher importance, while in stroke models systolic blood pressure and age had higher importance.
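As an illustration of the discrimination metric reported above, the following is a minimal sketch of how Harrell's C-index can be computed from follow-up times, event indicators, and model risk scores. The function name `harrell_c_index` and the toy inputs are assumptions made for this example and do not come from the study; pairs with tied follow-up times are simply skipped, which is a simplification.

```python
import numpy as np

def harrell_c_index(time, event, risk):
    """Share of comparable pairs in which the subject with the earlier
    observed event also has the higher predicted risk (ties in risk count 0.5)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)

    concordant = 0.0
    comparable = 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a pair is anchored on a subject with an observed event
        # Subject j is comparable with i if j was still under follow-up after time[i].
        later = time > time[i]
        comparable += int(later.sum())
        concordant += float((risk[i] > risk[later]).sum())
        concordant += 0.5 * float((risk[i] == risk[later]).sum())

    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: months of follow-up, event flag (1 = event, 0 = censored).
toy_time = [5, 10, 15, 20]
toy_event = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.1]  # higher score = higher predicted risk
c_perfect = harrell_c_index(toy_time, toy_event, toy_risk)
```

A value of 0.5 corresponds to a model that ranks subjects no better than chance, and 1.0 to a model that orders every comparable pair correctly.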

Conclusion: Our study analyzed classical regression and machine learning algorithms to develop cardiovascular disease risk prediction models for IHD and stroke in Japan that can be applied to practical use in a large population with predictive accuracy.

Keywords: Ischemic heart disease; Machine learning; Real-world data; Risk prediction model; Stroke.

Conflict of interest statement

None declared.

Figures

Fig. 1
Data extraction. IHD: Ischemic heart disease.
Fig. 2
Cumulative incidence of the outcome. IHD: Ischemic heart disease; mo: Months. The Y-axis (cumulative incidence) equals 1 − survival probability. The number of individuals at risk for each diagnosis is shown below the figure.
Fig. 3
Calibration plot of the IHD models. Calibration plot showing observed vs. predicted risk of the model. Observations were divided into groups of roughly the same size based on the risk score (linear predictor) computed from the model. The observed risk and its 95% confidence intervals were computed from the Poisson GLM.
Fig. 4
Calibration plot of the stroke models. Calibration plot showing observed vs. predicted risk of the model. Observations were divided into groups of roughly the same size based on the risk score (linear predictor) computed from the model. The observed risk and its 95% confidence intervals were computed from the Poisson GLM.
Fig. 5
Feature importance for IHD and stroke models. AST: Aspartate transaminase; ALT: Alanine transaminase; BMI: Body mass index; CKD: Chronic kidney disease; Cox: Cox proportional-hazards model; HbA1c: Glycosylated hemoglobin; GPT: Glutamate pyruvate transaminase; GOT: Glutamate oxaloacetate transaminase; HDL: High-density lipoprotein; LDL: Low-density lipoprotein; XGBoost: Extreme gradient boosting. Features were arranged by mean importance score across the models in descending order.
Fig. 6
Accumulated local effects plots for the IHD model. BMI: Body mass index; Cox: Cox proportional-hazards model; HbA1c: Glycosylated hemoglobin; γGTP: Gamma-glutamyl transpeptidase; HDL: High-density lipoprotein; LDL: Low-density lipoprotein; XGBoost: Extreme gradient boosting. Only the top 10 most important features were included in the figure.
Fig. 7
Accumulated local effects plots for the stroke model. AST: Aspartate transaminase; ALT: Alanine transaminase; BMI: Body mass index; Cox: Cox proportional-hazards model; HbA1c: Glycosylated hemoglobin; HDL: High-density lipoprotein; XGBoost: Extreme gradient boosting. Only the top 10 most important features were included in the figure.
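The grouping step described in the calibration captions for Figs. 3 and 4 can be sketched as follows: observations are sorted by risk score, split into groups of roughly equal size, and the mean predicted risk in each group is compared with the observed risk. This is a simplified illustration, not the authors' procedure: it uses the crude event proportion per group in place of the Poisson GLM estimate and its confidence intervals, and it ignores censoring. The function name `calibration_table` and the toy data are assumptions made for this example.

```python
import numpy as np

def calibration_table(risk_score, predicted_risk, observed_event, n_groups=10):
    """Group observations by risk score and compare predicted with observed risk.

    Assumption: observed risk is the crude event proportion per group,
    a stand-in for the Poisson GLM estimate named in the captions.
    """
    risk_score = np.asarray(risk_score, dtype=float)
    predicted_risk = np.asarray(predicted_risk, dtype=float)
    observed_event = np.asarray(observed_event, dtype=float)
    if n_groups < 1 or n_groups > len(risk_score):
        raise ValueError("n_groups must be between 1 and the number of observations")

    # Sort by risk score, then cut the ordering into near-equal-sized groups.
    order = np.argsort(risk_score, kind="stable")
    rows = []
    for group, idx in enumerate(np.array_split(order, n_groups)):
        rows.append({
            "group": group,
            "n": int(len(idx)),
            "mean_predicted": float(predicted_risk[idx].mean()),
            "observed": float(observed_event[idx].mean()),
        })
    return rows

# Hypothetical toy data: 20 subjects, events only in the higher-risk half.
scores = np.arange(20)
predicted = scores / 20.0
events = (scores >= 10).astype(int)
table = calibration_table(scores, predicted, events, n_groups=4)
```

Plotting `mean_predicted` against `observed` for each row gives the points of a calibration plot; points near the diagonal indicate good agreement between predicted and observed risk.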
