A New Filter Approach Based on Effective Ranges for Classification of Gene Expression Data
- PMID: 37668992
- DOI: 10.1089/big.2022.0086
Abstract
Over the years, many studies have sought to reduce or eliminate the effects of diseases on human health. Gene expression data sets play a critical role in diagnosing and treating diseases. These data sets contain thousands of genes but only a small number of samples. This imbalance gives rise to the curse of dimensionality and makes such data sets difficult to analyze. One of the most effective strategies for addressing this problem is feature selection, a preprocessing step that improves classification accuracy by retaining only the most relevant and informative features. In this article, we propose a new statistically based filter method for feature selection, named the Effective Range-based Feature Selection Algorithm (FSAER). As an extension of the earlier Effective Range based Gene Selection (ERGS) and Improved Feature Selection based on Effective Range (IFSER) algorithms, the method combines the advantages of both while also taking the disjoint area into account. To illustrate the efficacy of the proposed algorithm, experiments were conducted on six benchmark gene expression data sets. FSAER was compared with other filter methods in terms of classification accuracy, using support vector machines, the naive Bayes classifier, and the k-nearest neighbor algorithm as classifiers.
Keywords: classification methods; effective range; feature selection; filter methods; gene expression data.
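The abstract does not spell out how FSAER scores features, so the following is only a minimal sketch of the general effective-range idea that ERGS-style filters build on, not the authors' algorithm. It assumes that each class's effective range for a feature is its class mean plus or minus (1 - class prior) * gamma * class standard deviation, that gamma = 1.732, and that a feature is weighted by one minus its normalized overlap between class ranges. The function name `effective_range_weights` and all parameter names are hypothetical, and the disjoint-area term that FSAER adds is not modeled here.

```python
import numpy as np

GAMMA = 1.732  # assumed multiplier for the range half-width (not from the abstract)


def effective_range_weights(X, y, gamma=GAMMA):
    """Score features by how little their per-class effective ranges overlap.

    X: (n_samples, n_features) expression matrix; y: (n_samples,) class labels.
    Returns one weight per feature in [0, 1]; higher means less overlap.
    This is an illustrative ERGS-style score, not the FSAER method itself.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / counts.sum()

    # Per-class mean and standard deviation of every feature.
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    stds = np.stack([X[y == c].std(axis=0) for c in classes])

    # Assumed effective range: mean +/- (1 - prior) * gamma * std.
    half_width = (1.0 - priors)[:, None] * gamma * stds
    lower, upper = means - half_width, means + half_width

    n_classes, n_features = lower.shape
    overlap = np.zeros(n_features)
    for i in range(n_features):
        # Sort classes by lower bound, then sum positive pairwise overlaps.
        order = np.argsort(lower[:, i])
        lo, hi = lower[order, i], upper[order, i]
        for j in range(n_classes - 1):
            for k in range(j + 1, n_classes):
                overlap[i] += max(hi[j] - lo[k], 0.0)

    # Area coefficient: overlap relative to the total span of all ranges.
    span = upper.max(axis=0) - lower.min(axis=0)
    area_coef = np.divide(overlap, span, out=np.zeros_like(overlap), where=span > 0)

    # Normalize by the largest coefficient so weights fall in [0, 1].
    max_coef = area_coef.max()
    normalized = area_coef / max_coef if max_coef > 0 else area_coef
    return 1.0 - normalized
```

Under these assumptions, genes could be ranked by the returned weights, the top-ranked subset kept, and the reduced matrix passed to any of the classifiers named in the abstract.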
© 2023, Mary Ann Liebert, Inc., publishers
Conflict of interest statement
Author Disclosure Statement: No competing financial interests exist.
Similar articles
- Feature weight estimation for gene selection: a local hyperlinear learning approach. BMC Bioinformatics. 2014 Mar 14;15:70. doi: 10.1186/1471-2105-15-70. PMID: 24625071. Free PMC article.
- A Tri-Stage Wrapper-Filter Feature Selection Framework for Disease Classification. Sensors (Basel). 2021 Aug 18;21(16):5571. doi: 10.3390/s21165571. PMID: 34451013. Free PMC article.
- R-HEFS: Rough set based heterogeneous ensemble feature selection method for medical data classification. Artif Intell Med. 2021 Apr;114:102049. doi: 10.1016/j.artmed.2021.102049. Epub 2021 Mar 6. PMID: 33875164.
- Computer-assisted lip diagnosis on Traditional Chinese Medicine using multi-class support vector machines. BMC Complement Altern Med. 2012 Aug 16;12:127. doi: 10.1186/1472-6882-12-127. PMID: 22898352. Free PMC article.
- An efficient statistical feature selection approach for classification of gene expression data. J Biomed Inform. 2011 Aug;44(4):529-35. doi: 10.1016/j.jbi.2011.01.001. Epub 2011 Jan 15. PMID: 21241823.