Predicting survival from microarray data--a comparative study
- PMID: 17553857
- DOI: 10.1093/bioinformatics/btm305
Abstract
Motivation: Survival prediction from gene expression data and other high-dimensional genomic data has been the subject of much research in recent years. Such data pose the methodological problem of having many more gene expression values than individuals. In addition, the responses are censored survival times. Most of the proposed methods handle this by using Cox's proportional hazards model and obtaining parameter estimates through some dimension reduction or parameter shrinkage technique. Using three well-known microarray gene expression data sets, we compare the prediction performance of seven such methods: univariate selection, forward stepwise selection, principal components regression (PCR), supervised principal components regression, partial least squares regression (PLS), ridge regression and the lasso.
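To make the shrinkage idea concrete, the following is a minimal sketch of ridge-penalized Cox regression: the negative log partial likelihood plus a squared-norm penalty, minimized by plain gradient descent. It is not the authors' Matlab or R code. The function names, the toy data, the step size and the penalty values are all assumptions made for illustration, and the sketch assumes there are no tied survival times.

```python
import math

def linear_predictor(beta, X):
    """Return the linear predictor x_i' beta for every individual."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]

def cox_ridge_objective(beta, X, time, event, lam):
    """Negative Cox log partial likelihood plus a ridge penalty (no tied times assumed)."""
    eta = linear_predictor(beta, X)
    n = len(X)
    nll = 0.0
    for i in range(n):
        if event[i] != 1:
            continue  # censored individuals contribute only through risk sets
        risk = [eta[j] for j in range(n) if time[j] >= time[i]]
        m = max(risk)  # shift for a numerically stable log-sum-exp
        nll -= eta[i] - (m + math.log(sum(math.exp(r - m) for r in risk)))
    return nll + 0.5 * lam * sum(b * b for b in beta)

def cox_ridge_gradient(beta, X, time, event, lam):
    """Gradient of cox_ridge_objective with respect to beta."""
    eta = linear_predictor(beta, X)
    n, p = len(X), len(beta)
    grad = [lam * b for b in beta]  # derivative of the ridge penalty
    for i in range(n):
        if event[i] != 1:
            continue
        idx = [j for j in range(n) if time[j] >= time[i]]
        m = max(eta[j] for j in idx)
        w = [math.exp(eta[j] - m) for j in idx]
        total = sum(w)
        for k in range(p):
            # Weighted mean of covariate k over the risk set
            mean_k = sum(wj * X[j][k] for wj, j in zip(w, idx)) / total
            grad[k] -= X[i][k] - mean_k
    return grad

def fit_cox_ridge(X, time, event, lam, step=0.05, n_iter=2000):
    """Fixed-step gradient descent; step and n_iter are illustrative choices."""
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        g = cox_ridge_gradient(beta, X, time, event, lam)
        beta = [b - step * gk for b, gk in zip(beta, g)]
    return beta

# Hypothetical toy data: 6 individuals, 2 covariates, event = 1 means death observed.
X = [[1.0, 0.2], [0.5, -0.3], [-0.2, 0.8], [-1.0, 0.1], [0.3, -0.9], [-0.6, 0.1]]
time = [2.0, 3.5, 5.0, 7.0, 4.0, 6.0]
event = [1, 1, 0, 1, 1, 0]

beta_weak = fit_cox_ridge(X, time, event, lam=0.1)     # little shrinkage
beta_strong = fit_cox_ridge(X, time, event, lam=10.0)  # heavy shrinkage
print(beta_weak, beta_strong)
```

A larger penalty pulls the coefficients toward zero, which is what keeps the estimates finite when there are more covariates than individuals. Replacing the squared penalty with a sum of absolute values would give the lasso, which needs a different optimizer than the smooth gradient step used here.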
Results: Statistical learning from subsets should be repeated several times in order to get a fair comparison between methods. Methods using coefficient shrinkage or linear combinations of the gene expression values perform much better than the simple variable selection methods. For our data sets, ridge regression has the best overall performance.
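The point about repeating the learning on several subsets can be sketched as follows: a method is fitted on a random training subset, scored on the held-out individuals, and the whole procedure is repeated so that the spread of scores becomes visible. Everything in this sketch is an assumption made for illustration: the simulated data, the crude censoring, the 2/3 training fraction, the simple univariate selection rule, and the use of a concordance index as a stand-in score, which is not claimed to be the criterion used in the study.

```python
import math
import random
import statistics

def concordance(risk, time, event):
    """Fraction of comparable pairs in which the higher-risk individual fails first."""
    concordant = comparable = 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if event[i] == 1 and time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    # Fall back to 0.5 (no information) if no pair can be compared
    return concordant / comparable if comparable else 0.5

def repeated_split_scores(n, fit_and_score, n_repeats=50, train_fraction=2 / 3, seed=0):
    """Score a method on n_repeats random training/test splits of n individuals."""
    rng = random.Random(seed)
    n_train = int(round(train_fraction * n))
    scores = []
    for _ in range(n_repeats):
        shuffled = list(range(n))
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        scores.append(fit_and_score(train, test))
    return scores

# Simulated data (hypothetical): only covariate 0 affects the hazard.
data_rng = random.Random(1)
x = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(30)]
time = [data_rng.expovariate(math.exp(row[0])) for row in x]
event = [1 if data_rng.random() < 0.8 else 0 for _ in x]  # crude stand-in for censoring

def univariate_method(train, test):
    """Pick the covariate most associated with survival on train, score it on test."""
    best_k, best_sign, best_gap = 0, 1.0, -1.0
    for k in range(3):
        c = concordance([x[i][k] for i in train],
                        [time[i] for i in train],
                        [event[i] for i in train])
        if abs(c - 0.5) > best_gap:
            best_k, best_sign, best_gap = k, (1.0 if c >= 0.5 else -1.0), abs(c - 0.5)
    risk = [best_sign * x[i][best_k] for i in test]
    return concordance(risk, [time[i] for i in test], [event[i] for i in test])

scores = repeated_split_scores(len(x), univariate_method, n_repeats=20)
print(statistics.mean(scores), min(scores), max(scores))
```

Comparing the minimum and maximum scores shows how much a single split can flatter or penalize a method, which is why a comparison between methods is better based on the distribution over many splits than on one of them.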
Availability: Matlab and R code for the prediction methods is available at http://www.med.uio.no/imb/stat/bmms/software/microsurv/.