Machine learning models in breast cancer survival prediction
- PMID: 26409558
- DOI: 10.3233/THC-151071
Abstract
Background: Breast cancer is one of the most common cancers among women and has a high mortality rate. With early diagnosis, breast cancer survival increases from 56% to more than 86%. An accurate and reliable system is therefore necessary for the early diagnosis of this cancer. The proposed model combines rules with different machine learning techniques. Machine learning models can help physicians reduce the number of false decisions: they exploit patterns and relationships among a large number of cases and predict the outcome of a disease from historical cases stored in datasets.
Objective: The objective of this study is to propose a rule-based classification method that uses machine learning techniques to predict different types of breast cancer survival.
Methods: We used a dataset with eight attributes containing the records of 900 patients, of whom 876 (97.3%) were female and 24 (2.7%) were male. Naive Bayes (NB), Trees Random Forest (TRF), 1-Nearest Neighbor (1NN), AdaBoost (AD), Support Vector Machine (SVM), RBF Network (RBFN), and Multilayer Perceptron (MLP) machine learning techniques were used with the proposed model, under 10-fold cross-validation, to predict breast cancer survival. The performance of the techniques was evaluated by accuracy, precision, sensitivity, specificity, and area under the ROC curve.
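The evaluation protocol described in the Methods can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names (stratified_folds, binary_metrics, roc_auc), the choice to stratify the folds by class, and the choice of which class counts as "positive" are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class ratios in each."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin; the offset continues where the previous
        # class stopped so that total fold sizes stay balanced.
        for pos, idx in enumerate(indices):
            folds[(pos + offset) % k].append(idx)
        offset += len(indices)
    return folds


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, sensitivity and specificity from hard predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "precision": _ratio(tp, tp + fp),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),  # true negative rate
    }


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve as the probability that a positive case
    scores higher than a negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Hypothetical label vector with the class sizes reported in the abstract
# (803 alive, 97 dead); the 1/0 encoding is an assumption.
labels = [1] * 803 + [0] * 97
folds = stratified_folds(labels, k=10)
print([len(fold) for fold in folds])
```

In each of the 10 rounds, one fold would serve as the test set and the remaining nine as the training set; the per-fold metrics are then averaged. Stratifying matters with classes this unbalanced, because a purely random split could leave some folds with very few cases from the smaller class.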
Results: Of the 900 patients, 803 were alive and 97 were dead. The Trees Random Forest (TRF) technique showed better results than the other techniques (NB, 1NN, AD, SVM, RBFN, and MLP). The accuracy, sensitivity, and area under the ROC curve of TRF were 96%, 96%, and 93%, respectively. The 1NN technique, however, performed poorly (accuracy 91%, sensitivity 91%, and area under the ROC curve 78%).
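As context for these figures, the class counts imply that a trivial rule which always predicts the majority class ("alive") would already be correct for 803 of 900 patients, about 89.2%. The abstract does not report such a baseline; the short sketch below is only illustrative arithmetic on the reported numbers, and the helper name majority_baseline_accuracy is hypothetical.

```python
def majority_baseline_accuracy(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())


# Class counts and accuracies as reported in the Results.
counts = {"alive": 803, "dead": 97}
reported_accuracy = {"TRF": 0.96, "1NN": 0.91}

baseline = majority_baseline_accuracy(counts)
for name, acc in reported_accuracy.items():
    # Show how far each reported accuracy sits above the trivial baseline.
    print(f"{name}: accuracy {acc:.0%}, {acc - baseline:+.1%} "
          f"relative to the majority baseline of {baseline:.1%}")
```

With classes this unbalanced, accuracy alone separates the models only weakly, which is one reason to also look at the area under the ROC curve, where the reported gap between TRF (93%) and 1NN (78%) is much larger.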
Conclusions: This study demonstrates that the Trees Random Forest (TRF) model, a rule-based classification model, was the best model, with the highest accuracy. This model is therefore recommended as a useful tool for breast cancer survival prediction and for medical decision making.
Keywords: Breast cancer survival prediction; classification; machine learning models.