A LASSO FOR HIERARCHICAL INTERACTIONS

Jacob Bien¹, Jonathan Taylor¹, Robert Tibshirani¹

PMID: 26257447
PMCID: PMC4527358
DOI: 10.1214/13-AOS1096

Ann Stat. 2013 Jun;41(3):1111-1141. doi: 10.1214/13-AOS1096.

Affiliation

¹ Cornell University, Stanford University and Stanford University.

Abstract

We add a set of convex constraints to the lasso to produce sparse interaction models that honor the hierarchy restriction that an interaction only be included in a model if one or both variables are marginally important. We give a precise characterization of the effect of this hierarchy constraint, prove that hierarchy holds with probability one and derive an unbiased estimate for the degrees of freedom of our estimator. A bound on this estimate reveals the amount of fitting "saved" by the hierarchy constraint. We distinguish between parameter sparsity (the number of nonzero coefficients) and practical sparsity (the number of raw variables one must measure to make a new prediction). Hierarchy focuses on the latter, which is more closely tied to important data collection concerns such as cost, time and effort. We develop an algorithm, available in the R package hierNet, and perform an empirical study of our method.

Keywords: Regularized regression; convexity; hierarchical sparsity; interactions; lasso.
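
The abstract's distinction between parameter sparsity and practical sparsity, and the hierarchy restriction itself, can be illustrated with a minimal sketch. This is not the hierNet implementation. The function names, the representation of a fitted model as a main-effect vector `beta` plus a p-by-p interaction matrix `theta` with zero diagonal, and the toy coefficient values are all assumptions made for illustration. "Strong" hierarchy here means both variables of a nonzero interaction have nonzero main effects; "weak" means at least one does.

```python
import numpy as np

def _interaction_support(theta):
    # Treat an interaction (j, k) as present if either triangle is nonzero.
    return (theta != 0) | (theta.T != 0)

def parameter_sparsity(beta, theta):
    """Number of nonzero coefficients: main effects plus distinct interactions."""
    support = _interaction_support(theta)
    n_main = int(np.count_nonzero(beta))
    n_inter = int(np.count_nonzero(np.triu(support, k=1)))
    return n_main + n_inter

def practical_sparsity(beta, theta):
    """Number of raw variables that must be measured to make a prediction."""
    support = _interaction_support(theta)
    used = (beta != 0) | support.any(axis=1)
    return int(np.count_nonzero(used))

def satisfies_hierarchy(beta, theta, strong=True):
    """Check that every nonzero interaction has nonzero parent main effects."""
    active = beta != 0
    pairs = np.argwhere(np.triu(_interaction_support(theta), k=1))
    if strong:
        return all(active[j] and active[k] for j, k in pairs)
    return all(active[j] or active[k] for j, k in pairs)

# Hypothetical fitted model on p = 4 variables.
beta = np.array([1.5, 0.0, -2.0, 0.0])
theta = np.zeros((4, 4))
theta[0, 2] = theta[2, 0] = 0.7   # both parents have main effects
theta[0, 1] = theta[1, 0] = 0.3   # only variable 0 has a main effect

print(parameter_sparsity(beta, theta))                 # 2 main effects + 2 interactions
print(practical_sparsity(beta, theta))                 # variables 0, 1 and 2 are needed
print(satisfies_hierarchy(beta, theta, strong=True))   # violated by the (0, 1) pair
print(satisfies_hierarchy(beta, theta, strong=False))
```

In this toy model the interaction between variables 0 and 1 adds one coefficient but also forces variable 1 to be measured, which is the data-collection cost that practical sparsity counts.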

Figures

**Fig. 1**
Olive oil data: (Top left) Parameter sparsity is the number of nonzero coefficients while practical sparsity is the number of *measured* variables in the model. Results from all 100 random train-test splits are shown as points; lines show the average performance over all 100 runs. (Top right) Misclassification error on test set versus practical sparsity. (Bottom) Wheel plots showing the sparsity pattern at 6 values of λ for the strong hierarchical lasso. Filled nodes correspond to nonzero main effects, and edges correspond to nonzero interactions.

**Fig. 2**
Numerical evaluation of how well $\widehat{\mathrm{df}}_{\lambda}$ estimates $\mathrm{df}_{\lambda}$. Monte Carlo estimates of $E[\widehat{\mathrm{df}}_{\lambda}]$ (y-axis) versus Monte Carlo estimates of $\mathrm{df}_{\lambda}$ (x-axis) for a sequence of λ values (circles) are shown. One-standard-error bars are drawn but are hardly visible. Our bound on the unbiased estimate is plotted with diamonds.

**Fig. 3**
Prediction error: Dashed line shows Bayes error (i.e., σ²), and the base rate refers to the prediction error of ȳ_train. Green, red and blue colors indicate hierarchy, all-pairs, and main effect only, respectively; solid and striped indicate lasso and forward stepwise, respectively.

**Fig. 4**
Plots show the ability of various methods to correctly recover the nonzero interactions. Shown are the sensitivity (i.e., proportion of $\Theta_{jk} \neq 0$ for which $\hat{\Theta}_{jk} \neq 0$) and specificity (i.e., proportion of $\Theta_{jk} = 0$ for which $\hat{\Theta}_{jk} = 0$) corresponding to the lowest prediction error model of each method.

**Fig. 5**
HIV drug data: Test-set RMSE versus practical sparsity (i.e., number of measured variables required for prediction) for six different drugs. For each method, the data from all 20 runs are displayed in faint colors; the thick lines are averages over these runs.
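
The interaction-recovery measures reported in Fig. 4 can be computed along the following lines. This is a minimal sketch under assumptions: the function name `interaction_recovery` is hypothetical, each pair j < k is counted once using the upper triangle, and the toy matrices are invented for illustration rather than taken from the paper's simulations.

```python
import numpy as np

def interaction_recovery(theta_true, theta_hat):
    """Sensitivity and specificity of the estimated interaction support.

    Sensitivity: fraction of truly nonzero interactions estimated as nonzero.
    Specificity: fraction of truly zero interactions estimated as zero.
    Returns NaN for a measure whose denominator is empty.
    """
    upper = np.triu_indices(theta_true.shape[0], k=1)
    truth = theta_true[upper] != 0
    est = theta_hat[upper] != 0
    n_pos, n_neg = truth.sum(), (~truth).sum()
    sensitivity = (truth & est).sum() / n_pos if n_pos else float("nan")
    specificity = (~truth & ~est).sum() / n_neg if n_neg else float("nan")
    return float(sensitivity), float(specificity)

# Toy example with p = 4 variables, so 6 candidate interactions.
theta_true = np.zeros((4, 4))
theta_true[0, 1] = 1.0
theta_true[2, 3] = -0.5

theta_hat = np.zeros((4, 4))
theta_hat[0, 1] = 0.8   # correctly recovered
theta_hat[0, 2] = 0.1   # false positive; the (2, 3) interaction is missed

sens, spec = interaction_recovery(theta_true, theta_hat)
print(sens, spec)
```

Here one of the two true interactions is recovered, and three of the four truly absent interactions are correctly left at zero.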

Grants and funding

R01 EB001988/EB/NIBIB NIH HHS/United States
