Differential gene expression detection and sample classification using penalized linear regression models
- PMID: 16352654
- DOI: 10.1093/bioinformatics/bti827
Erratum in
- Bioinformatics. 2006 Apr 15;22(8):1029
Abstract
Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and the small number of samples n (p >> n), microarray data pose major challenges for statistical analysis. An obvious problem with 'large p, small n' data is over-fitting: just by chance, we are likely to find some non-differentially expressed genes that classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been applied successfully in microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple and intuitive, and have proved useful in empirical studies. Recently, Wu proposed penalized t/F-statistics with shrinkage, derived formally from L1 penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discuss the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. from L1 penalized regression models. Finally, we show that penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
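The connection the abstract draws between L1 penalized regression and shrunken centroids can be illustrated with a minimal sketch. It relies on one standard fact: in one dimension, the minimizer of 0.5 * (z - b)^2 + lam * |b| is the soft-thresholded value of z. The function names `soft_threshold` and `shrunken_class_offsets`, the threshold `lam`, and the simulated data are all hypothetical choices for illustration. The sketch fixes the overall mean and omits the per-gene standardization used in practice, so it is not the paper's exact formulation.

```python
import numpy as np

def soft_threshold(z, lam):
    """Closed-form minimizer over b of 0.5 * (z - b)**2 + lam * |b|."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def shrunken_class_offsets(X, y, lam):
    """Soft-threshold each class centroid's offset from the overall mean.

    X: (n_samples, n_genes) expression matrix; y: integer class labels.
    Assumption: offsets are left unstandardized (no pooled standard
    deviation), which keeps the sketch short but differs from practice.
    """
    overall = X.mean(axis=0)
    classes = np.unique(y)
    offsets = np.stack([X[y == k].mean(axis=0) - overall for k in classes])
    return classes, overall, soft_threshold(offsets, lam)

# Simulated data: 12 samples, 5 genes, 3 classes of 4 samples each.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))
y = np.repeat([0, 1, 2], 4)
X[y == 0, 0] += 3.0  # only gene 0 is shifted, and only in class 0

classes, overall, d = shrunken_class_offsets(X, y, lam=1.0)
print(d.round(2))  # rows: classes, columns: genes; zeros mark shrunk-out genes
```

Offsets smaller in magnitude than `lam` are set exactly to zero, so the corresponding gene no longer distinguishes that class; this is the sense in which an L1 penalty performs gene selection and shrinkage at once.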
Similar articles
- Improved centroids estimation for the nearest shrunken centroid classifier. Bioinformatics. 2007 Apr 15;23(8):972-9. doi: 10.1093/bioinformatics/btm046. Epub 2007 Mar 24. PMID: 17384429
- Independent component analysis-based penalized discriminant method for tumor classification using gene expression data. Bioinformatics. 2006 Aug 1;22(15):1855-62. doi: 10.1093/bioinformatics/btl190. Epub 2006 May 18. PMID: 16709589
- Cancer classification and prediction using logistic regression with Bayesian gene selection. J Biomed Inform. 2004 Aug;37(4):249-59. doi: 10.1016/j.jbi.2004.07.009. PMID: 15465478
- How does gene expression clustering work? Nat Biotechnol. 2005 Dec;23(12):1499-501. doi: 10.1038/nbt1205-1499. PMID: 16333293 Review.
- Key aspects of analyzing microarray gene-expression data. Pharmacogenomics. 2007 May;8(5):473-82. doi: 10.2217/14622416.8.5.473. PMID: 17465711 Review.
Cited by
- Bias-corrected diagonal discriminant rules for high-dimensional classification. Biometrics. 2010 Dec;66(4):1096-106. doi: 10.1111/j.1541-0420.2010.01395.x. PMID: 20222939 Free PMC article.
- Classifying Incomplete Gene-Expression Data: Ensemble Learning with Non-Pre-Imputation Feature Filtering and Best-First Search Technique. Int J Mol Sci. 2018 Oct 30;19(11):3398. doi: 10.3390/ijms19113398. PMID: 30380746 Free PMC article.
- Characterizing Human Cell Types and Tissue Origin Using the Benford Law. Cells. 2019 Aug 29;8(9):1004. doi: 10.3390/cells8091004. PMID: 31470662 Free PMC article.
- Identification of significant features in DNA microarray data. Wiley Interdiscip Rev Comput Stat. 2013 Jul;5(4):10.1002/wics.1260. doi: 10.1002/wics.1260. PMID: 24244802 Free PMC article.
- EPS-LASSO: test for high-dimensional regression under extreme phenotype sampling of continuous traits. Bioinformatics. 2018 Jun 15;34(12):1996-2003. doi: 10.1093/bioinformatics/bty042. PMID: 29385408 Free PMC article.