SVM-CART for Disease Classification
- PMID: 33012942
- PMCID: PMC7531767
- DOI: 10.1080/02664763.2019.1625876
Abstract
Classification and regression trees (CART) and support vector machines (SVM) have become very popular statistical learning tools for analyzing complex data that often arise in biomedical research. While both CART and SVM serve as powerful classifiers in many clinical settings, there are some common scenarios in which each fails to meet the performance and interpretability needed for use as a clinical decision-making tool. In this paper, we propose a new classification method, SVM-CART, that combines features of SVM and CART to produce a more flexible classifier that has the potential to outperform either method in terms of interpretability and prediction accuracy. Furthermore, to enhance prediction accuracy, we provide extensions of a single SVM-CART to an ensemble, and methods to extract a representative classifier from the SVM-CART ensemble. The goal is to produce a decision-making tool that can be used in the clinical setting, while still harnessing the stability and predictive improvements gained through developing the SVM-CART ensemble. An extensive simulation study is conducted to assess the performance of the methods in various settings. Finally, we illustrate our methods using a clinical neuropathy dataset.
Keywords: Classification and Regression Trees; Complex Interactions; Ensemble Classifiers; Statistical Learning; Support Vector Machines.
Similar articles
- SVM and SVM Ensembles in Breast Cancer Prediction. PLoS One. 2017 Jan 6;12(1):e0161501. doi: 10.1371/journal.pone.0161501. eCollection 2017. PMID: 28060807. Free PMC article.
- Ensemble support vector machine classification of dementia using structural MRI and mini-mental state examination. J Neurosci Methods. 2018 May 15;302:66-74. doi: 10.1016/j.jneumeth.2018.01.003. Epub 2018 Feb 3. PMID: 29378218.
- Vicinal support vector classifier using supervised kernel-based clustering. Artif Intell Med. 2014 Mar;60(3):189-96. doi: 10.1016/j.artmed.2014.01.003. Epub 2014 Feb 7. PMID: 24637294.
- Reviewing ensemble classification methods in breast cancer. Comput Methods Programs Biomed. 2019 Aug;177:89-112. doi: 10.1016/j.cmpb.2019.05.019. Epub 2019 May 20. PMID: 31319964. Review.
- Class-imbalanced classifiers for high-dimensional data. Brief Bioinform. 2013 Jan;14(1):13-26. doi: 10.1093/bib/bbs006. Epub 2012 Mar 9. PMID: 22408190. Review.
Cited by
- Research on Sleep Staging Based on Support Vector Machine and Extreme Gradient Boosting Algorithm. Nat Sci Sleep. 2024 Nov 26;16:1827-1847. doi: 10.2147/NSS.S467111. eCollection 2024. PMID: 39629225. Free PMC article.
- Strategies for overcoming data scarcity, imbalance, and feature selection challenges in machine learning models for predictive maintenance. Sci Rep. 2024 Apr 26;14(1):9645. doi: 10.1038/s41598-024-59958-9. PMID: 38671068. Free PMC article.
- Effects of environmental phenols on eGFR: machine learning modeling methods applied to cross-sectional studies. Front Public Health. 2024 Aug 1;12:1405533. doi: 10.3389/fpubh.2024.1405533. eCollection 2024. PMID: 39148651. Free PMC article.
- Altered Fractional Amplitude of Low-Frequency Fluctuation in Major Depressive Disorder and Bipolar Disorder. Front Psychiatry. 2021 Oct 13;12:739210. doi: 10.3389/fpsyt.2021.739210. eCollection 2021. PMID: 34721109. Free PMC article.
- Use of machine learning to predict medication adherence in individuals at risk for atherosclerotic cardiovascular disease. Smart Health (Amst). 2022 Dec;26:100328. doi: 10.1016/j.smhl.2022.100328. Epub 2022 Oct 4. PMID: 37169026. Free PMC article.