Minimum redundancy feature selection from microarray gene expression data
- PMID: 15852500
- DOI: 10.1142/s0219720005001004
Abstract
Selecting a small subset out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their differential expression among phenotypes and pick the top-ranked genes. We observe that feature sets obtained this way contain redundancy, and we study methods to minimize it. We propose a minimum redundancy - maximum relevance (MRMR) feature selection framework. Genes selected via MRMR provide a more balanced coverage of the space and capture broader characteristics of phenotypes. They lead to significantly improved class predictions in extensive experiments on 6 gene expression data sets: NCI, Lymphoma, Lung, Child Leukemia, Leukemia, and Colon. Improvements are observed consistently among 4 classification methods: Naive Bayes, Linear discriminant analysis, Logistic regression, and Support vector machines. SUPPLEMENTARY: The top 60 MRMR genes for each of the data sets are listed at http://crd.lbl.gov/~cding/MRMR/. More information related to MRMR methods can be found at http://www.hpeng.net/.
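The abstract describes the idea but not the scoring formula, so the following is only a minimal sketch of what a greedy minimum redundancy - maximum relevance selection might look like. It assumes discretized expression values, mutual information as the measure of both relevance (gene vs. phenotype) and redundancy (gene vs. gene), and a difference criterion (relevance minus mean redundancy). The names `mutual_information` and `mrmr_select` and the toy data are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    _, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    # Joint frequency table, normalized to a probability table.
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of x, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of y, shape (1, ny)
    nz = joint > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def mrmr_select(X, y, k):
    """Greedily pick k columns of X (samples x genes, discretized).

    Assumed criterion: relevance to y minus mean redundancy with the
    genes already selected (a difference form; an assumption here).
    """
    n_features = X.shape[1]
    k = min(k, n_features)
    relevance = np.array(
        [mutual_information(X[:, j], y) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]  # start with the most relevant gene
    redundancy_sum = np.zeros(n_features)
    while len(selected) < k:
        last = selected[-1]
        # Accumulate redundancy of every gene with the newest selected gene.
        redundancy_sum += np.array(
            [mutual_information(X[:, j], X[:, last]) for j in range(n_features)]
        )
        score = relevance - redundancy_sum / len(selected)
        score[selected] = -np.inf           # never pick the same gene twice
        selected.append(int(np.argmax(score)))
    return selected


# Toy data (made up): gene 1 duplicates gene 0; gene 2 is a noisier but
# complementary signal. Four phenotype classes.
a = np.array([0, 0, 0, 0, 1, 1, 1, 1])
b = np.array([0, 0, 1, 1, 0, 0, 1, 1])
y = 2 * a + b
X = np.column_stack([a, a, np.array([0, 0, 1, 1, 0, 0, 1, 0])])

selected = mrmr_select(X, y, 2)  # -> [0, 2]
```

On this toy input, ranking by relevance alone would place genes 0 and 1 at the top even though gene 1 adds no new information; the redundancy penalty makes the sketch choose gene 2 as the second feature instead.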