Ensemble feature selection for stable biomarker identification and cancer classification from microarray expression data
- PMID: 35016102
- DOI: 10.1016/j.compbiomed.2021.105208
Abstract
Microarray technology enables the simultaneous measurement of the expression of tens of thousands of genes, making it possible to study cancers and tumors at the molecular level. Because microarray data are typically characterized by small sample size and high dimensionality, accurate and stable feature selection is of fundamental importance to diagnostic accuracy and to a deep understanding of disease mechanisms. In this study, we therefore present an ensemble feature selection framework to improve the discrimination and stability of the finally selected features. Specifically, we use sampling techniques to obtain multiple sampled datasets, and from each of them a base feature selector selects a subset of features. We then develop two aggregation strategies to combine the multiple feature subsets into one set. Finally, comparative experiments are conducted on four publicly available microarray datasets, covering both binary and multi-class cases, and are evaluated in terms of classification accuracy and three stability metrics. Results show that the proposed method obtains better stability scores and achieves classification performance comparable to, or even better than, that of its competitors.
Keywords: Ensemble learning; Feature selection; Gene expression profiles; Stability.
Copyright © 2022 Elsevier Ltd. All rights reserved.
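The pipeline described in the abstract (resample the data, run a base selector on each sample, aggregate the resulting subsets) can be illustrated with a minimal sketch. The abstract does not name the sampling technique, the base selector, the two aggregation strategies, or the three stability metrics, so everything concrete below is an assumption made for illustration: bootstrap resampling, an F-score-style ranking, "frequency" and "mean_rank" aggregation, and average pairwise Jaccard similarity as one possible stability measure. The function names `f_score`, `ensemble_select`, and `jaccard_stability` are hypothetical and are not taken from the paper.

```python
import numpy as np


def f_score(X, y):
    """Assumed base selector: between-class over within-class variance per feature."""
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def ensemble_select(X, y, k, n_rounds=30, strategy="frequency", seed=0):
    """Select k features by aggregating rankings over bootstrap samples.

    strategy="frequency": keep the features most often ranked in the top k.
    strategy="mean_rank": keep the features with the lowest average rank.
    Both strategies are illustrative assumptions, not the paper's definitions.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    votes = np.zeros(p)
    rank_sum = np.zeros(p)
    for _ in range(n_rounds):
        idx = rng.integers(0, n, size=n)      # bootstrap: sample rows with replacement
        order = np.argsort(-f_score(X[idx], y[idx]))  # best feature first
        votes[order[:k]] += 1
        ranks = np.empty(p)
        ranks[order] = np.arange(p)           # rank 0 = best feature in this round
        rank_sum += ranks
    if strategy == "frequency":
        final = np.argsort(-votes, kind="stable")[:k]
    elif strategy == "mean_rank":
        final = np.argsort(rank_sum, kind="stable")[:k]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return np.sort(final)


def jaccard_stability(subsets):
    """Average pairwise Jaccard similarity between selected feature subsets."""
    sets = [set(s) for s in subsets]
    if len(sets) < 2:
        raise ValueError("need at least two subsets")
    pairs = [(a, b) for i, a in enumerate(sets) for b in sets[i + 1:]]
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
```

Calling `ensemble_select(X, y, k=50)` on an expression matrix `X` (samples by genes) with class labels `y` returns the sorted indices of the selected genes. Running it with different seeds and passing the resulting index lists to `jaccard_stability` gives a rough sense of how stable the selection is under resampling.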
Similar articles
- A two-stage hybrid biomarker selection method based on ensemble filter and binary differential evolution incorporating binary African vultures optimization. BMC Bioinformatics. 2023 Apr 4;24(1):130. doi: 10.1186/s12859-023-05247-7. PMID: 37016297 Free PMC article.
- A multi-classification deep neural network for cancer type identification from high-dimension, small-sample and imbalanced gene microarray data. Sci Rep. 2025 Feb 12;15(1):5239. doi: 10.1038/s41598-025-89475-2. PMID: 39939378 Free PMC article.
- Robust biomarker identification for cancer diagnosis with ensemble feature selection methods. Bioinformatics. 2010 Feb 1;26(3):392-8. doi: 10.1093/bioinformatics/btp630. Epub 2009 Nov 25. PMID: 19942583
- Filter versus wrapper gene selection approaches in DNA microarray domains. Artif Intell Med. 2004 Jun;31(2):91-103. doi: 10.1016/j.artmed.2004.01.007. PMID: 15219288 Review.
- Classification of breast cancer using microarray gene expression data: A survey. J Biomed Inform. 2021 May;117:103764. doi: 10.1016/j.jbi.2021.103764. Epub 2021 Apr 6. PMID: 33831535 Review.
Cited by
- A comprehensive learning based swarm optimization approach for feature selection in gene expression data. Heliyon. 2024 Sep 2;10(17):e37165. doi: 10.1016/j.heliyon.2024.e37165. eCollection 2024 Sep 15. PMID: 39296018 Free PMC article.
- Comparative Study of Classification Algorithms for Various DNA Microarray Data. Genes (Basel). 2022 Mar 11;13(3):494. doi: 10.3390/genes13030494. PMID: 35328048 Free PMC article.
- Enhancing Cancerous Gene Selection and Classification for High-Dimensional Microarray Data Using a Novel Hybrid Filter and Differential Evolutionary Feature Selection. Cancers (Basel). 2024 Nov 22;16(23):3913. doi: 10.3390/cancers16233913. PMID: 39682102 Free PMC article.
- Deep learning assisted cancer disease prediction from gene expression data using WT-GAN. BMC Med Inform Decis Mak. 2024 Oct 24;24(1):311. doi: 10.1186/s12911-024-02712-y. PMID: 39449042 Free PMC article.
- Radiomics for Discrimination between Early-Stage Nasopharyngeal Carcinoma and Benign Hyperplasia with Stable Feature Selection on MRI. Cancers (Basel). 2022 Jul 14;14(14):3433. doi: 10.3390/cancers14143433. PMID: 35884494 Free PMC article.