Comparative Study

BMC Med Inform Decis Mak. 2017 Dec 19;17(1):174.

Comparison of machine learning techniques to predict all-cause mortality using fitness data: the Henry ford exercIse testing (FIT) project

Sherif Sakr¹, Radwa Elshawi², Amjad M Ahmed¹, Waqas T Qureshi³, Clinton A Brawner⁴, Steven J Keteyian¹, Michael J Blaha⁵, Mouaz H Al-Mallah⁶,⁷

Affiliations

¹ King AbdulAziz Cardiac Center, Ministry of National Guard, Health Affairs, King Abdulaziz Medical City for National Guard - Health affairs, King Abdullah International Medical Research Center, King Saud bin Abdulaziz University for Health Sciences, Department Mail Code: 1413, P.O. Box 22490, Riyadh, 11426, Kingdom of Saudi Arabia.
² Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
³ Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC, USA.
⁴ Division of Cardiovascular Medicine, Henry Ford Hospital, Detroit, MI, USA.
⁵ Johns Hopkins University, Baltimore, MD, USA.
⁶ King AbdulAziz Cardiac Center, Ministry of National Guard, Health Affairs, King Abdulaziz Medical City for National Guard - Health affairs, King Abdullah International Medical Research Center, King Saud bin Abdulaziz University for Health Sciences, Department Mail Code: 1413, P.O. Box 22490, Riyadh, 11426, Kingdom of Saudi Arabia. AlMallahMo@ngha.med.sa.
⁷ Division of Cardiovascular Medicine, Henry Ford Hospital, Detroit, MI, USA. AlMallahMo@ngha.med.sa.

PMID: 29258510
PMCID: PMC5735871
DOI: 10.1186/s12911-017-0566-6

Sherif Sakr et al. BMC Med Inform Decis Mak. 2017.

Abstract

Background: Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that assign data to predetermined categories. The aim of this study is to evaluate and compare how machine learning techniques can be applied to medical records of cardiorespiratory fitness and how the various techniques differ in their ability to predict medical outcomes (e.g. mortality).

Methods: We used data from 34,212 patients free of known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Seven machine learning classification techniques were evaluated: Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayesian Classifier (BC), Bayesian Network (BN), K-Nearest Neighbor (KNN) and Random Forest (RF). To handle the class imbalance in the dataset, the Synthetic Minority Over-Sampling Technique (SMOTE) was applied.
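The oversampling idea behind SMOTE can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe; the function name `smote`, the two-feature toy data, and the choice to pick base samples at random (rather than looping over every minority sample) are assumptions made for illustration only.

```python
import random
from math import dist


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating between a
    minority point and one of its k nearest minority-class neighbours.
    Simplified sketch: base points are drawn at random."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical, already-scaled two-feature minority class (illustrative only).
minority = [(0.10, 0.90), (0.20, 0.80), (0.15, 0.85), (0.30, 0.70)]
new_points = smote(minority, n_synthetic=8, k=2)
```

Each synthetic point lies on a line segment between two real minority samples, so the minority class is enlarged without simply duplicating records.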

Results: Two sets of experiments were conducted, with and without the SMOTE sampling technique. Averaged over the different evaluation metrics, the SVM classifier showed the lowest performance, while other models such as BN, BC and DT performed better. The RF classifier showed the best performance (AUC = 0.97) among all models trained using SMOTE sampling.
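For readers unfamiliar with the AUC metric reported here, the following sketch computes it from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the toy labels and scores are assumptions for illustration and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of
    positive and negative scores (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died, 0 = survived) and model scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.5, 0.1]
result = auc(labels, scores)  # 5 of the 6 positive-negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. The pairwise form is quadratic in the number of cases, so it is suited to small examples rather than a cohort of this size.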

Conclusions: The results show that ML techniques can vary significantly in their performance across the different evaluation metrics. A more complex ML model does not necessarily achieve higher prediction accuracy. The prediction performance of all models trained with SMOTE is much better than that of models trained without SMOTE. The study shows the potential of machine learning methods for predicting all-cause mortality using cardiorespiratory fitness data.

Keywords: All-cause mortality; FIT (Henry ford ExercIse testing) project; Machine learning.

Conflict of interest statement

Ethics approval and consent to participate

This article does not contain any studies with human participants or animals performed by any of the authors. The FIT project was approved by the IRB (ethics committee) of Henry Ford Hospital (IRB #: 5812). Informed consent was waived due to the retrospective nature of the study, so consent to participate is not required.

Consent for publication

Not applicable. The manuscript doesn’t contain any individual identifying data.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

**Fig. 1**
The ranking of the variables based on the outcome of the Feature Selection Process

**Fig. 2**
AUC of the different models with different percentages of synthetic examples created using SMOTE

**Fig. 3**
The ROC curves of the different machine learning classification models. The models are: Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayesian Classifier (BC), Bayesian Network (BN) and K-Nearest Neighbor (KNN). The results show that without the SMOTE sampling method (a), BC and BN achieve the highest AUC (0.81), while with the SMOTE sampling method (b), the KNN model achieves the highest AUC (0.94).

Cited by

Prediction of Prednisolone Dose Correction Using Machine Learning.
Sato H, Kimura Y, Ohba M, Ara Y, Wakabayashi S, Watanabe H. J Healthc Inform Res. 2023 Feb 15;7(1):84-103. doi: 10.1007/s41666-023-00128-3. eCollection 2023 Mar. PMID: 36910914.
Estimated Artificial Neural Network Modeling of Maximal Oxygen Uptake Based on Multistage 10-m Shuttle Run Test in Healthy Adults.
Park HY, Jung H, Lee S, Kim JW, Cho HL, Nam SS. Int J Environ Res Public Health. 2021 Aug 12;18(16):8510. doi: 10.3390/ijerph18168510. PMID: 34444259.
Machine-learning predicts time-series prognosis factors in metastatic prostate cancer patients treated with androgen deprivation therapy.
Saito S, Sakamoto S, Higuchi K, Sato K, Zhao X, Wakai K, Kanesaka M, Kamada S, Takeuchi N, Sazuka T, Imamura Y, Anzai N, Ichikawa T, Kawakami E. Sci Rep. 2023 Apr 18;13(1):6325. doi: 10.1038/s41598-023-32987-6. PMID: 37072487.
Using machine-learning approach to distinguish patients with methamphetamine dependence from healthy subjects in a virtual reality environment.
Ding X, Li Y, Li D, Li L, Liu X. Brain Behav. 2020 Nov;10(11):e01814. doi: 10.1002/brb3.1814. Epub 2020 Aug 29. PMID: 32862513.
Integrative Interpretation of Cardiopulmonary Exercise Tests for Cardiovascular Outcome Prediction: A Machine Learning Approach.
Cauwenberghs N, Sente J, Van Criekinge H, Sabovčik F, Ntalianis E, Haddad F, Claes J, Claessen G, Budts W, Goetschalckx K, Cornelissen V, Kuznetsova T. Diagnostics (Basel). 2023 Jun 13;13(12):2051. doi: 10.3390/diagnostics13122051. PMID: 37370946.
