Cancer Classification Utilizing Voting Classifier with Ensemble Feature Selection Method and Transcriptomic Data
- PMID: 37761941
- PMCID: PMC10530870
- DOI: 10.3390/genes14091802
Abstract
Biomarker-based tools for identifying and classifying cancer are widely used in bioinformatics and machine learning. However, the high dimensionality of microarray gene expression data makes it difficult to identify the genes that matter for cancer diagnosis. Many feature selection algorithms address this by selecting a small subset of informative features. This article proposes an ensemble rank-based feature selection method (EFSM) and an ensemble weighted average voting classifier (VT) to overcome this challenge. The EFSM aggregates the rankings produced by individual feature selection methods to efficiently identify the most relevant features. The VT combines support vector machine, k-nearest neighbor, and decision tree algorithms into a single ensemble model. The proposed method was tested on three benchmark datasets and compared with existing built-in ensemble models. Our model achieved higher accuracy: 100% for leukaemia, 94.74% for colon cancer, and 94.34% for the 11-tumor dataset. The study concludes by identifying a subset of the most important cancer-causing genes and demonstrating their significance relative to the original data. The proposed approach surpasses existing strategies in accuracy and stability, detecting vital genes with higher precision than other existing methods, with significant implications for the development of ML-based gene analysis.
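The two ideas named in the abstract, rank aggregation across feature selectors and weighted average voting across classifiers, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the aggregation rule or the weights, so the sketch assumes mean-rank aggregation and a normalized weighted average of class probabilities. The gene names, scores, class labels, probabilities, and weights are all hypothetical.

```python
from collections import defaultdict


def aggregate_ranks(score_tables, top_k):
    """Mean-rank aggregation (an assumed rule, not taken from the paper).

    score_tables: list of dicts mapping feature name -> relevance score,
    one dict per individual feature selection method. Every method is
    assumed to score the same set of features.
    """
    rank_sums = defaultdict(float)
    for scores in score_tables:
        # Rank 1 is the highest-scoring feature within this method.
        ordered = sorted(scores, key=lambda g: scores[g], reverse=True)
        for rank, gene in enumerate(ordered, start=1):
            rank_sums[gene] += rank
    n_methods = len(score_tables)
    mean_rank = {g: total / n_methods for g, total in rank_sums.items()}
    # A lower mean rank is better; ties are broken by name for determinism.
    return sorted(mean_rank, key=lambda g: (mean_rank[g], g))[:top_k]


def weighted_vote(probabilities, weights):
    """Weighted average of per-classifier class probabilities.

    probabilities: list of dicts mapping class label -> probability,
    one dict per base classifier. weights: one weight per classifier.
    Returns the winning label and the combined probabilities.
    """
    total_weight = sum(weights)
    combined = defaultdict(float)
    for probs, weight in zip(probabilities, weights):
        for label, p in probs.items():
            combined[label] += weight * p / total_weight
    return max(combined, key=combined.get), dict(combined)


# Hypothetical scores from three feature selection methods.
scores_a = {"g1": 0.9, "g2": 0.1, "g3": 0.5, "g4": 0.3}
scores_b = {"g1": 0.7, "g2": 0.2, "g3": 0.8, "g4": 0.1}
scores_c = {"g1": 0.6, "g2": 0.4, "g3": 0.5, "g4": 0.2}
selected = aggregate_ranks([scores_a, scores_b, scores_c], top_k=2)

# Hypothetical class probabilities from SVM, k-NN, and decision tree
# for a single sample, with hypothetical weights 2, 1, 1.
svm_probs = {"tumor": 0.4, "normal": 0.6}
knn_probs = {"tumor": 0.7, "normal": 0.3}
tree_probs = {"tumor": 0.8, "normal": 0.2}
label, combined = weighted_vote([svm_probs, knn_probs, tree_probs], [2, 1, 1])

print(selected)
print(label, combined)
```

In this toy input, a feature that one method ranks first but the others rank lower can still be selected if its mean rank is low, which is the stabilizing effect that rank aggregation is meant to provide. In practice the weights would likely be tuned on validation data rather than fixed by hand.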
Keywords: cancer detection; feature selection; gene analysis; gene data; machine learning; voting classifier.
Conflict of interest statement
The authors declare no conflict of interest.
