An efficient statistical feature selection approach for classification of gene expression data
- PMID: 21241823
- DOI: 10.1016/j.jbi.2011.01.001
Abstract
Classification of gene expression data plays a significant role in the prediction and diagnosis of diseases. Gene expression data has a characteristic mismatch between the gene dimension and the sample dimension: the number of genes far exceeds the number of samples. Not all genes contribute to efficient classification of samples. A robust feature selection algorithm is required to identify the important genes that help classify the samples efficiently. Many feature selection algorithms have been introduced to select informative genes (features) based on relevance and redundancy characteristics. Most of the earlier algorithms require a computationally expensive search strategy to find an optimal feature subset. Existing feature selection methods are also sensitive to the evaluation measures. The paper introduces a novel and efficient feature selection approach, termed ERGS (Effective Range based Gene Selection), based on the statistically defined effective range of each feature for every class. The basic principle behind ERGS is that a higher weight is given to a feature that discriminates the classes clearly. Experimental results on well-known gene expression datasets illustrate the effectiveness of the proposed approach. Two popular classifiers, the Naïve Bayes Classifier (NBC) and the Support Vector Machine (SVM), have been used for classification. The proposed feature selection algorithm can help rank genes and is capable of identifying the most relevant genes responsible for diseases such as leukemia, colon tumor, lung cancer, diffuse large B-cell lymphoma (DLBCL), and prostate cancer.
Copyright © 2011 Elsevier Inc. All rights reserved.
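The abstract states the principle behind ERGS (a per-class effective range for every feature, with higher weight for features whose class ranges are clearly separated) but does not give the formulas. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that the effective range of a feature in a class is the class mean plus or minus a multiple of the class standard deviation, scaled by one minus the class prior, and that overlap is summed over all class pairs. The constant `GAMMA` and the function names `effective_ranges`, `overlap_coefficient`, and `rank_features` are hypothetical, chosen only for illustration.

```python
from statistics import mean, pstdev

GAMMA = 1.732  # assumed half-width multiplier; illustrative, not taken from the abstract


def effective_ranges(values, labels, gamma=GAMMA):
    """Return one (lower, upper) interval per class for a single feature.

    Assumption: half-width = (1 - class prior) * gamma * class std deviation.
    """
    n = len(labels)
    ranges = []
    for c in sorted(set(labels)):
        xs = [v for v, y in zip(values, labels) if y == c]
        prior = len(xs) / n
        half = (1.0 - prior) * gamma * pstdev(xs)
        m = mean(xs)
        ranges.append((m - half, m + half))
    return ranges


def overlap_coefficient(ranges):
    """Total pairwise overlap length between class ranges, divided by the total span."""
    overlap = 0.0
    for i in range(len(ranges)):
        for j in range(i + 1, len(ranges)):
            lo = max(ranges[i][0], ranges[j][0])
            hi = min(ranges[i][1], ranges[j][1])
            overlap += max(0.0, hi - lo)
    span = max(r[1] for r in ranges) - min(r[0] for r in ranges)
    # A feature with zero span is constant everywhere, so treat it as fully overlapping.
    return overlap / span if span > 0 else 1.0


def rank_features(X, y):
    """X is a list of samples, each a list of feature values; y holds the class labels.

    Returns (weights, order): weights lie in [0, 1], and order lists feature
    indices from the highest weight to the lowest.
    """
    n_features = len(X[0])
    coeffs = []
    for f in range(n_features):
        column = [row[f] for row in X]
        coeffs.append(overlap_coefficient(effective_ranges(column, y)))
    worst = max(coeffs)
    # Less overlap between class ranges means a higher weight.
    weights = [1.0 - c / worst if worst > 0 else 1.0 for c in coeffs]
    order = sorted(range(n_features), key=lambda f: weights[f], reverse=True)
    return weights, order


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 4.0], [0.8, 6.0],
     [9.0, 5.1], [9.2, 4.1], [8.8, 5.9]]
y = [0, 0, 0, 1, 1, 1]
weights, order = rank_features(X, y)
print(order, weights)
```

On the toy data, the class ranges of feature 0 do not overlap, so it is ranked ahead of feature 1. Because each feature is scored independently, a ranking of this kind needs no search over feature subsets, which is consistent with the efficiency claim in the abstract. The top-ranked features would then be passed to a classifier such as NBC or SVM.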