Development and validation of ischemic heart disease and stroke prognostic models using large-scale real-world data from Japan
- PMID: 36792224
- PMCID: PMC9989775
- DOI: 10.1265/ehpm.22-00106
Abstract
Background: Previous cardiovascular risk prediction models in Japan have relied on prospective cohort studies with concise data. As health information, including health check-up records and administrative claims, becomes digitalized and publicly available, large datasets based on such real-world data can achieve high prediction accuracy and support the social implementation of cardiovascular disease risk prediction models in preventive and clinical practice. In this study, classical regression and machine learning methods were explored to develop ischemic heart disease (IHD) and stroke prognostic models using real-world data.
Methods: The IQVIA Japan Claims Database was searched to include 691,160 individuals (predominantly corporate employees working in secondary and tertiary industries, and their families) with at least one annual health check-up record during the identification period (April 2013 to December 2018). The primary outcome of the study was the first recorded IHD or stroke event. Predictors were taken from the annual health check-up record at the index year-month and comprised demographic characteristics, laboratory tests, and questionnaire features. Four prediction models (Cox, Elnet-Cox, XGBoost, and Ensemble) were assessed to develop a cardiovascular disease risk prediction model for Japan.
Results: The analysis cohort consisted of 572,971 individuals. All prediction models showed similarly good performance. Harrell's C-index was close to 0.9 for all IHD models, and above 0.7 for stroke models. In IHD models, age, sex, high-density lipoprotein, low-density lipoprotein, cholesterol, and systolic blood pressure had higher importance, while in stroke models systolic blood pressure and age had higher importance.
Conclusion: Our study analyzed classical regression and machine learning algorithms to develop cardiovascular disease risk prediction models for IHD and stroke in Japan that can be applied in practice to a large population with good predictive accuracy.
Keywords: Ischemic heart disease; Machine learning; Real-world data; Risk prediction model; Stroke.
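The Results report model discrimination as Harrell's C-index. As a minimal sketch of what that statistic measures (not the authors' implementation), the following Python function counts, among comparable pairs of individuals, how often the one with the earlier observed event also received the higher predicted risk. The function name `harrell_c_index`, the toy follow-up times, event flags, and risk scores are all hypothetical and chosen only for illustration; tied follow-up times are simply treated as non-comparable here.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored data (illustrative sketch).

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter follow-up time than subject j. The pair is
    concordant when subject i also has the higher predicted risk; ties in
    predicted risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event flag (1 = event, 0 = censored),
# and a model's predicted risk score for five individuals.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.7, 0.3, 0.1]

toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"C-index on toy data: {toy_c:.3f}")  # 7 of 8 comparable pairs concordant
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values (close to 0.9 for IHD, above 0.7 for stroke) should be read.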
Conflict of interest statement
None declared.