Consistent Group Identification and Variable Selection in Regression with Correlated Predictors
- PMID: 23772171
- PMCID: PMC3678393
- DOI: 10.1080/15533174.2012.707849
Abstract
Statistical procedures for variable selection have become integral elements in any analysis. Successful procedures are characterized by high predictive accuracy and interpretable models, while retaining computational efficiency. Penalized methods that perform coefficient shrinkage have been shown to be successful in many cases. Models with correlated predictors are particularly challenging to tackle. We propose a penalization procedure that performs variable selection while clustering groups of predictors automatically. The oracle properties of this procedure, including consistency in group identification, are also studied. The proposed method compares favorably with existing selection approaches in both prediction accuracy and model discovery, while retaining its computational efficiency. Supplemental materials are available online.
Keywords: Coefficient shrinkage; Correlation; Group identification; Oracle properties; Penalization; Supervised clustering; Variable selection.