Variable selection for multivariate failure time data
- PMID: 19458784
- PMCID: PMC2674767
- DOI: 10.1093/biomet/92.2.303
Abstract
In this paper, we propose a penalised pseudo-partial likelihood method for variable selection in multivariate failure time data with a growing number of regression coefficients. Under certain regularity conditions, we show the consistency and asymptotic normality of the penalised likelihood estimators. We further demonstrate that, for certain penalty functions with proper choices of regularisation parameters, the resulting estimator can correctly identify the true model, as if it were known in advance. Based on a simple approximation of the penalty function, the proposed method can be easily carried out with the Newton-Raphson algorithm. We conduct extensive Monte Carlo simulation studies to assess the finite sample performance of the proposed procedures. We illustrate the proposed method by analysing a dataset from the Framingham Heart Study.
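The abstract does not say which penalty or which approximation is used, so the following is only a minimal sketch under stated assumptions: it takes the SCAD penalty (with the conventional a = 3.7) as one possible choice of penalty function, and a local quadratic approximation of that penalty as one common way to obtain Newton-Raphson updates. The function names `scad_derivative` and `lqa_newton`, the callable interface `grad`/`hess`, and the thresholds `eps` and `tol` are hypothetical and not taken from the paper. In a survival setting, `grad` and `hess` would be the gradient and Hessian of the negative log pseudo-partial likelihood; the toy call at the end uses a quadratic stand-in loss purely to show the mechanics.

```python
import numpy as np


def scad_derivative(theta, lam, a=3.7):
    """First derivative of the SCAD penalty at theta >= 0 (assumed penalty choice)."""
    theta = np.asarray(theta, dtype=float)
    tail = np.maximum(a * lam - theta, 0.0) / ((a - 1.0) * lam)
    return lam * np.where(theta <= lam, 1.0, tail)


def lqa_newton(grad, hess, beta0, lam, n, a=3.7, eps=1e-6, tol=1e-8, max_iter=100):
    """Minimise loss(beta) + n * sum_j p_lam(|beta_j|) with Newton-Raphson steps
    on a local quadratic approximation of the penalty.

    grad, hess: callables returning the gradient and Hessian of the
    unpenalised loss (hypothetical interface, assumed for illustration).
    """
    beta = np.array(beta0, dtype=float)
    for _ in range(max_iter):
        # Coefficients that have shrunk below eps are set to zero and dropped.
        beta[np.abs(beta) < eps] = 0.0
        active = np.flatnonzero(beta)
        if active.size == 0:
            break
        b = beta[active]
        # Quadratic approximation: the penalty contributes curvature p'(|b_j|) / |b_j|.
        d = scad_derivative(np.abs(b), lam, a) / np.abs(b)
        g = grad(beta)[active] + n * d * b
        H = hess(beta)[np.ix_(active, active)] + n * np.diag(d)
        step = np.linalg.solve(H, g)
        beta[active] = b - step
        if np.max(np.abs(step)) < tol:
            break
    beta[np.abs(beta) < eps] = 0.0
    return beta


# Toy check with a quadratic stand-in loss (n / 2) * ||beta - z||^2,
# which is NOT a survival likelihood; it only exercises the update rule.
n, z = 100, np.array([5.0, 0.1])
beta_hat = lqa_newton(lambda b: n * (b - z), lambda b: n * np.eye(2), z, lam=1.0, n=n)
```

In this toy problem the large coefficient lies beyond the range where SCAD penalises and is left unshrunk, while the small one is driven to exactly zero, which is the kind of behaviour the model-identification claim in the abstract relies on.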