Penalized methods for bi-level variable selection

Patrick Breheny¹, Jian Huang

PMID: 20640242
PMCID: PMC2904563
DOI: 10.4310/sii.2009.v2.n3.a10

Patrick Breheny et al. Stat Interface. 2009 Jul 1;2(3):369-380.

Affiliation

¹ Department of Biostatistics, University of Kentucky, Lexington, Kentucky 40506, USA.

Abstract

In many applications, covariates possess a grouping structure that can be incorporated into the analysis to select important groups as well as important members of those groups. This work focuses on the incorporation of grouping structure into penalized regression. We investigate the previously proposed group lasso and group bridge penalties as well as a novel method, group MCP, introducing a framework and conducting simulation studies that shed light on the behavior of these methods. To fit these models, we use the idea of a locally approximated coordinate descent to develop algorithms which are fast and stable even when the number of features is much larger than the sample size. Finally, these methods are applied to a genetic association study of age-related macular degeneration.
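
As background for the coordinate descent idea mentioned in the abstract, the following is a minimal sketch of cyclic coordinate descent for the ordinary lasso. It is not the locally approximated algorithm developed in the paper; the function names `soft_threshold` and `lasso_cd`, the objective scaling (1/2n), and the fixed iteration count are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward 0 by t, clipping at 0."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    Hypothetical helper for illustration; plain lasso only.
    """
    n, p = X.shape
    beta = np.zeros(p)
    r = y.astype(float).copy()            # current residual y - X @ beta
    col_ss = (X ** 2).sum(axis=0) / n     # (1/n) * squared norm of each column
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue                  # skip all-zero columns
            # Univariate least-squares target with coordinate j's effect added back.
            z = X[:, j] @ r / n + col_ss[j] * beta[j]
            new = soft_threshold(z, lam) / col_ss[j]
            r += X[:, j] * (beta[j] - new)  # keep residual in sync with beta
            beta[j] = new
    return beta
```

Each update touches a single coefficient and costs O(n), which is one reason coordinate-wise methods remain practical when the number of features is much larger than the sample size.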

Figures

Figure 1. Derivatives of penalty functions referenced in this paper. Left: Ridge (gray line), lasso (dashed line) and bridge (γ = 1/2, solid black line) penalties. Right: MCP (solid black line) and SCAD (dashed line) penalties
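
The derivatives named in the Figure 1 caption can be sketched as follows, using the standard textbook forms of each penalty for θ ≥ 0. The scaling of the ridge penalty as (λ/2)θ², the default tuning values `a=3.0` for MCP and `a=3.7` for SCAD, and all function names are assumptions for illustration; the figure itself may use different constants.

```python
import numpy as np

def ridge_deriv(theta, lam=1.0):
    """Derivative of (lam/2)*theta^2: grows linearly, so large effects are shrunk most."""
    return lam * np.asarray(theta, dtype=float)

def lasso_deriv(theta, lam=1.0):
    """Derivative of lam*theta: constant rate of penalization."""
    return lam * np.ones_like(np.asarray(theta, dtype=float))

def bridge_deriv(theta, lam=1.0, gamma=0.5):
    """Derivative of lam*theta^gamma; for gamma < 1 it diverges as theta -> 0+."""
    theta = np.asarray(theta, dtype=float)
    return lam * gamma * theta ** (gamma - 1.0)

def mcp_deriv(theta, lam=1.0, a=3.0):
    """MCP derivative: starts at lam and decreases linearly to 0 at theta = a*lam."""
    theta = np.asarray(theta, dtype=float)
    return np.maximum(lam - theta / a, 0.0)

def scad_deriv(theta, lam=1.0, a=3.7):
    """SCAD derivative: lam up to theta = lam, then a linear decline to 0 at a*lam."""
    theta = np.asarray(theta, dtype=float)
    tail = np.maximum(a * lam - theta, 0.0) / (a - 1.0)
    return np.where(theta <= lam, lam, tail)
```

Evaluating these on a grid such as `np.linspace(0.01, 5, 200)` (starting above 0 because the bridge derivative is unbounded there) should reproduce the qualitative shapes described in the caption.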

Figure 2. Penalties applied to a two-covariate group by the group lasso, group bridge, and group MCP methods. Note that where the penalty comes to a point or edge, there is the possibility that the solution will take on a sparse value; all penalties come to a point at 0, encouraging group-level sparsity, but only group bridge and group MCP allow for bi-level selection
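
The three group penalties compared in Figure 2 can be written down for a single group as a rough sketch. Group-size weighting constants are omitted here, the group MCP is taken to be an outer MCP applied to the sum of inner MCP penalties, and the outer tuning parameter is set to b = K·a·λ/2 so that the outer penalty levels off exactly when every member has reached its own plateau; these choices, along with the function names, are assumptions rather than a transcription of the paper's definitions.

```python
import numpy as np

def mcp(theta, lam, a):
    """MCP penalty for theta >= 0: quadratic up to a*lam, constant a*lam^2/2 after."""
    theta = np.asarray(theta, dtype=float)
    return np.where(theta <= a * lam,
                    lam * theta - theta ** 2 / (2.0 * a),
                    0.5 * a * lam ** 2)

def group_lasso(beta, lam=1.0):
    """L2 norm of the group: a point at the origin only, smooth elsewhere."""
    return lam * np.linalg.norm(beta, 2)

def group_bridge(beta, lam=1.0, gamma=0.5):
    """L1 norm of the group raised to the power gamma (bridge applied at group level)."""
    return lam * np.sum(np.abs(beta)) ** gamma

def group_mcp(beta, lam=1.0, a=3.0):
    """Outer MCP applied to the sum of inner MCP penalties on each member."""
    beta = np.asarray(beta, dtype=float)
    b = beta.size * a * lam / 2.0          # assumed outer tuning parameter
    inner = np.sum(mcp(np.abs(beta), lam, a))
    return float(mcp(inner, lam, b))
```

Evaluating the three functions over a grid of (β₁, β₂) values gives surfaces of the kind the caption describes: all three vanish at the origin, while the L1-based inner terms of group bridge and group MCP also produce edges along the coordinate axes.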

Figure 3. Coefficient paths from 0 to λ_max for group lasso, group bridge, and group MCP for a simulated data set featuring two groups, each with three covariates. In the underlying model, the solid line group has two covariates equal to 1 and the other equal to 0; the dotted line group has two coefficients equal to 0 and the other equal to −1

Figure 4. Model error for each method after selecting λ with BIC using one of two estimators for the effective number of model parameters. Solid line: Estimator (22). Dashed line: Using number of nonzero elements of β

Figure 5. Model error simulation results. In each panel, the number of nonzero groups is indicated in the strip at the top. The x-axis represents the number of nonzero elements per group. At each tick mark, 500 data sets were generated. A lowess curve has been fit to the points and plotted
