Stat Comput. 2015 Mar;25(2):173-187.
doi: 10.1007/s11222-013-9424-2.

Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors

Patrick Breheny et al. Stat Comput. 2015 Mar.

Abstract

Penalized regression is an attractive framework for variable selection problems. Often, variables possess a grouping structure, and the relevant selection problem is that of selecting groups, not individual variables. The group lasso has been proposed as a way of extending the ideas of the lasso to the problem of group selection. Nonconvex penalties such as SCAD and MCP have been proposed and shown to have several advantages over the lasso; these penalties may also be extended to the group selection problem, giving rise to group SCAD and group MCP methods. Here, we describe algorithms for fitting these models stably and efficiently. In addition, we present simulation results and real data examples comparing and contrasting the statistical properties of these methods.
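To make the idea of a group-wise descent algorithm concrete, here is a minimal sketch for the group lasso in the linear regression case. It is an illustration under stated assumptions, not the authors' implementation: it assumes each group has been orthonormalized so that (1/n) X_j^T X_j = I, that the response is centered, and that the penalty for group j is lam * sqrt(K_j) * ||beta_j||, where K_j is the group size. The function and argument names (`group_descent_lasso`, `X_groups`, `n_iter`) are hypothetical.

```python
import math


def soft_threshold(z, lam):
    """Scalar soft-thresholding: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def group_descent_lasso(X_groups, y, lam, n_iter=100):
    """Cycle over groups, updating each coefficient block in closed form.

    X_groups: list of groups; each group is a list of columns, and each
              column is a list of n floats. ASSUMPTION: every group is
              orthonormalized, i.e. (1/n) * X_j^T X_j is the identity.
    y:        centered response, list of n floats.
    lam:      regularization parameter (scaled by sqrt(group size) below).
    """
    n = len(y)
    beta = [[0.0] * len(group) for group in X_groups]
    r = list(y)  # residuals for the all-zero starting point
    for _ in range(n_iter):
        for j, group in enumerate(X_groups):
            # Unpenalized least squares solution for group j given the rest:
            # z_j = X_j^T r / n + beta_j
            z = [
                sum(col[i] * r[i] for i in range(n)) / n + b
                for col, b in zip(group, beta[j])
            ]
            norm = math.sqrt(sum(v * v for v in z))
            # Shrink the norm of z_j, keep its direction.
            thresh = lam * math.sqrt(len(group))
            scale = soft_threshold(norm, thresh) / norm if norm > 0 else 0.0
            new_beta = [scale * v for v in z]
            # Keep the residuals in sync with the updated coefficients.
            for col, b_old, b_new in zip(group, beta[j], new_beta):
                delta = b_new - b_old
                if delta != 0.0:
                    for i in range(n):
                        r[i] -= col[i] * delta
            beta[j] = new_beta
    return beta


# Toy design: columns of a 4x4 Hadamard matrix (mutually orthogonal, mean square 1).
c2 = [1.0, -1.0, 1.0, -1.0]
c3 = [1.0, 1.0, -1.0, -1.0]
c4 = [1.0, -1.0, -1.0, 1.0]
y = [2 * a + 1 * b + 0.1 * c for a, b, c in zip(c2, c3, c4)]
fit = group_descent_lasso([[c2, c3], [c4]], y, lam=0.5)
```

In this toy example the first group is shrunk as a whole toward zero while the weak second group is dropped entirely, which is the group-selection behavior the abstract describes. A nonconvex variant would presumably replace the scalar soft-threshold applied to the norm with a different thresholding rule; that substitution is not shown here.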

Figures

Figure 1
The impact of orthonormalization on the solution to the group lasso. Contour lines for the likelihood (least squares) surface are drawn, centered around the OLS solution, as well as the solution path for the group lasso as λ goes from 0 to ∞. Left: Non-orthonormal X. Right: Orthonormal X.
Figure 2
Lasso, SCAD, and MCP penalty functions, derivatives, and univariate solutions. The panel on the left plots the penalties themselves, the middle panel plots the first derivative of the penalty, and the right panel plots the univariate solutions as a function of the ordinary least squares estimate. The light gray line in the rightmost plot is the identity line. Note that none of the penalties are differentiable at βj = 0.
Figure 3
Representative solution paths for the group lasso, group MCP, and group SCAD methods. In the generating model, groups A and B have nonzero coefficients, while those belonging to group C are zero.
Figure 4
The impact of increasing coefficient magnitude on group regularization methods. Model size is given in terms of number of groups (i.e., the number of variables in the model is four times the amount shown). The faint gray line on the left is the theoretically optimal RMSE that can be achieved in this setting. The faint gray line on the right is the true model size.
Figure 5
Relationship between probe set 1372928_at and TRIM32, as estimated by group lasso, group MCP, and group SCAD. Estimates are superimposed on a scatterplot and restricted to pass through the mean expression for each probe set.
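
The univariate solutions described in the Figure 2 caption can be sketched as thresholding functions of the ordinary least squares estimate z. The forms below are the standard closed-form solutions for a single standardized predictor; the function names and the default values of gamma (3 for MCP, 3.7 for SCAD) are assumptions chosen for illustration, not values taken from this page.

```python
def soft_threshold(z, lam):
    """Lasso solution: shrink z toward zero by lam, exactly zero if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_solution(z, lam, gamma=3.0):
    """MCP (firm thresholding) solution; requires gamma > 1."""
    if abs(z) <= gamma * lam:
        # Rescaled soft-thresholding: shrinkage tapers off as |z| grows.
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # no shrinkage for large |z|


def scad_solution(z, lam, gamma=3.7):
    """SCAD solution; requires gamma > 2."""
    if abs(z) <= 2.0 * lam:
        return soft_threshold(z, lam)  # behaves like the lasso near zero
    if abs(z) <= gamma * lam:
        # Transition region between lasso-like shrinkage and no shrinkage.
        return soft_threshold(z, gamma * lam / (gamma - 1.0)) / (1.0 - 1.0 / (gamma - 1.0))
    return z  # no shrinkage for large |z|


# Evaluate each rule on a small grid of OLS estimates with lam = 1.
grid = [0.5, 1.5, 3.0, 5.0]
lasso_fit = [soft_threshold(z, 1.0) for z in grid]
mcp_fit = [mcp_solution(z, 1.0) for z in grid]
scad_fit = [scad_solution(z, 1.0) for z in grid]
```

All three rules set small estimates to zero, but only the lasso keeps subtracting lam from large estimates; MCP and SCAD return to the identity line once |z| exceeds gamma * lam, which matches the rightmost panel the caption describes.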

