A comprehensive learning based swarm optimization approach for feature selection in gene expression data
- PMID: 39296018
- PMCID: PMC11408137
- DOI: 10.1016/j.heliyon.2024.e37165
Abstract
Gene expression data analysis is challenging due to the high dimensionality and complexity of the data. Feature selection, which identifies relevant genes, is a common preprocessing step. We propose a Comprehensive Learning-Based Swarm Optimization (CLBSO) approach for feature selection in gene expression data. CLBSO combines the strengths of ants and grasshoppers to explore the high-dimensional search space efficiently. Ants perform local search and leave pheromone trails to guide the swarm, while grasshoppers use their ability to jump long distances to explore new regions and avoid local optima. The proposed approach was evaluated on several publicly available gene expression datasets and compared with state-of-the-art feature selection methods. CLBSO achieved an average accuracy improvement of 15% over classification on the original high-dimensional data and outperformed other feature selection methods by up to 10%. For instance, on the pancreatic cancer dataset, CLBSO achieved 97.2% accuracy, significantly higher than XGBoost-MOGA's 84.0%. Convergence analysis showed that CLBSO required fewer iterations to reach optimal solutions. Statistical analysis confirmed significant performance improvements, and stability analysis demonstrated consistent gene subset selection across different runs. These findings highlight the robustness and efficacy of CLBSO in handling complex gene expression datasets, making it a valuable tool for enhancing classification tasks in bioinformatics.
Keywords: Cancer classification; Comprehensive learning; Feature selection; Gene expression; Gene selection; Swarm intelligence.
© 2024 The Author(s).
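The abstract describes the search mechanism only at a high level: ants refine solutions locally and deposit pheromone, while grasshoppers make long jumps to escape local optima. The following is a minimal illustrative sketch of that general idea on a binary feature mask, not the authors' CLBSO implementation. Every function name, parameter value, the pheromone update rule, and the toy fitness function (a stand-in for classifier accuracy) are assumptions made for illustration.

```python
import random


def toy_fitness(mask, informative, penalty=0.01):
    """Hypothetical stand-in for classifier accuracy: rewards selecting the
    informative features and penalizes the size of the subset."""
    hits = sum(1 for i in informative if mask[i])
    return hits / len(informative) - penalty * sum(mask)


def ant_step(pheromone, rng):
    """Sample a feature mask; feature i is kept with probability pheromone[i]."""
    return [1 if rng.random() < p else 0 for p in pheromone]


def local_search(mask, fitness, rng, tries=20):
    """Local refinement: try single-bit flips and keep only improvements."""
    best, best_fit = mask[:], fitness(mask)
    for _ in range(tries):
        cand = best[:]
        cand[rng.randrange(len(cand))] ^= 1
        f = fitness(cand)
        if f > best_fit:
            best, best_fit = cand, f
    return best, best_fit


def grasshopper_jump(mask, rng, frac=0.3):
    """Long jump: flip a random fraction of the bits to reach a distant region."""
    cand = mask[:]
    k = max(1, int(frac * len(cand)))
    for j in rng.sample(range(len(cand)), k):
        cand[j] ^= 1
    return cand


def select_features(n_features, fitness, n_ants=10, n_iters=30, rho=0.1, seed=0):
    """Assumed hybrid loop: pheromone-guided sampling plus occasional long jumps."""
    rng = random.Random(seed)
    pheromone = [0.5] * n_features
    best = [0] * n_features
    best_fit = fitness(best)
    for _ in range(n_iters):
        # Exploitation: ants sample near the pheromone trail, then refine locally.
        for _ in range(n_ants):
            cand, f = local_search(ant_step(pheromone, rng), fitness, rng)
            if f > best_fit:
                best, best_fit = cand, f
        # Exploration: one long jump away from the current best, then refine.
        cand, f = local_search(grasshopper_jump(best, rng), fitness, rng)
        if f > best_fit:
            best, best_fit = cand, f
        # Evaporate and deposit: pull pheromone toward the best mask, clamped
        # so that no feature becomes impossible to select or to drop.
        pheromone = [min(0.95, max(0.05, (1 - rho) * p + rho * b))
                     for p, b in zip(pheromone, best)]
    return best, best_fit


informative = [2, 5, 7]  # hypothetical "relevant genes" among 20 features
best_mask, best_score = select_features(
    20, lambda m: toy_fitness(m, informative), seed=1)
print([i for i, bit in enumerate(best_mask) if bit], round(best_score, 3))
```

In a real wrapper setting, `toy_fitness` would be replaced by cross-validated classifier accuracy on the selected genes; the rest of the loop would stay structurally the same under these assumptions.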
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.