A scalable hierarchical lasso for gene-environment interactions

Natalia Zemlianskaia¹, W James Gauderman¹, Juan Pablo Lewinger¹

PMID: 36793591
PMCID: PMC9928188
DOI: 10.1080/10618600.2022.2039161

Natalia Zemlianskaia et al. J Comput Graph Stat. 2022;31(4):1091-1103. doi: 10.1080/10618600.2022.2039161. Epub 2022 Mar 31.

Affiliation

¹ Division of Biostatistics, Department of Preventive Medicine, University of Southern California.

Abstract

We describe a regularized regression model for the selection of gene-environment (G×E) interactions. The model focuses on a single environmental exposure and induces a main-effect-before-interaction hierarchical structure. We propose an efficient fitting algorithm and screening rules that can discard large numbers of irrelevant predictors with high accuracy. We present simulation results showing that the model outperforms existing joint selection methods for G×E interactions in terms of selection performance, scalability, and speed, and we provide a real data application. Our implementation is available in the gesso R package.
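
As an illustration of the main-effect-before-interaction hierarchy named in the abstract, the following Python sketch builds G×E columns for a single exposure and flags coefficient pairs that would break the hierarchy, read here as "an interaction for gene j may be nonzero only if the main effect of gene j is nonzero". This is an assumption-laden sketch, not the gesso implementation: the function names `interaction_columns` and `violates_hierarchy`, the list-of-rows data layout, and the tolerance argument are all hypothetical and chosen only for this example.

```python
def interaction_columns(G, e):
    """Build G×E columns for a single exposure.

    G is a list of n rows, each holding p genetic predictor values.
    e is a list of n exposure values (one per sample).
    Returns an n-by-p list of rows where entry (i, j) is G[i][j] * e[i].
    (Hypothetical helper for illustration only.)
    """
    if len(G) != len(e):
        raise ValueError("G and e must have the same number of samples")
    return [[g_ij * e_i for g_ij in row] for row, e_i in zip(G, e)]


def violates_hierarchy(beta_g, beta_gxe, tol=0.0):
    """Return indices j where the interaction is selected but the main effect is not.

    A coefficient counts as selected when its absolute value exceeds tol.
    (Hypothetical helper; the hierarchy rule is an assumed reading of the abstract.)
    """
    if len(beta_g) != len(beta_gxe):
        raise ValueError("beta_g and beta_gxe must have the same length")
    return [
        j
        for j, (b_main, b_int) in enumerate(zip(beta_g, beta_gxe))
        if abs(b_int) > tol and abs(b_main) <= tol
    ]


# Toy data: 3 samples, 2 genetic predictors, one binary exposure.
G = [[0, 1], [2, 1], [1, 0]]
e = [1, 0, 1]
GxE = interaction_columns(G, e)

# Gene 0 has both effects, gene 1 has neither, gene 2 has an interaction only.
bad = violates_hierarchy([0.5, 0.0, 0.0], [0.2, 0.0, 0.3])
```

In this toy example only the third gene has an interaction without a main effect, so it is the one a hierarchical model would be expected to rule out.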

Keywords: hierarchical variable selection; joint analysis; screening rules.

Figures

**Fig. 1**
(a) Dynamic SAFE regions, (b) dynamic GAP SAFE regions.

**Fig. 2**
Time comparison of proposed algorithms: mean runtime over 100 replicates on the y-axis, dual gap tolerance on the x-axis.

**Fig. 3**
Log working set size (log₁₀ (WS)) for all lambda pairs.

**Fig. 4**
Model performance (top row: AUC for G×E selection, bottom row: precision for G×E selection) as a function of the number of interactions discovered. *p* = 2500, *n* = 100, *p_G* = 10, *p_G×E* = 5.

**Fig. 5**
Geometric solution to problem (19). The optimal *δ_j* maximizes the range of possible *x* values.
