J Comput Graph Stat. 2022;31(4):1091-1103.
doi: 10.1080/10618600.2022.2039161. Epub 2022 Mar 31.

A scalable hierarchical lasso for gene-environment interactions

Natalia Zemlianskaia et al. J Comput Graph Stat. 2022.

Abstract

We describe a regularized regression model for the selection of gene-environment (G×E) interactions. The model focuses on a single environmental exposure and induces a main-effect-before-interaction hierarchical structure. We propose an efficient fitting algorithm and screening rules that can discard large numbers of irrelevant predictors with high accuracy. We present simulation results showing that the model outperforms existing joint selection methods for G×E interactions in terms of selection performance, scalability, and speed, and we provide a real data application. Our implementation is available in the gesso R package.

Keywords: hierarchical variable selection; joint analysis; screening rules.
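
The main-effect-before-interaction hierarchy described in the abstract can be illustrated with a minimal Python sketch. It is not the gesso implementation or its API; the function names (`build_gxe_design`, `satisfies_hierarchy`), the simulated data, and the coefficient vectors are all hypothetical and chosen only for illustration. The sketch assumes a single exposure vector E and a genotype matrix G, forms the G×E columns as elementwise products, and checks whether a coefficient pattern respects the hierarchy (an interaction may be nonzero only if its main effect is nonzero).

```python
import numpy as np


def build_gxe_design(G, E):
    """Return the n x p matrix whose column j is G[:, j] * E (hypothetical helper)."""
    return G * E[:, None]


def satisfies_hierarchy(beta_g, beta_gxe, tol=1e-12):
    """True if every nonzero interaction coefficient has a nonzero main effect."""
    active_g = np.abs(beta_g) > tol
    active_gxe = np.abs(beta_gxe) > tol
    # Only the main effects of selected interactions need to be active.
    return bool(np.all(active_g[active_gxe]))


# Simulated data (assumption: genotypes coded 0/1/2, one continuous exposure).
rng = np.random.default_rng(0)
n, p = 100, 6
G = rng.integers(0, 3, size=(n, p)).astype(float)
E = rng.normal(size=n)
GxE = build_gxe_design(G, E)

# Illustrative coefficient patterns, not fitted values.
beta_g = np.array([1.0, 0.5, 0.0, 0.0, -0.3, 0.0])
beta_gxe_ok = np.array([0.4, 0.0, 0.0, 0.0, 0.0, 0.0])   # interaction 0 has a main effect
beta_gxe_bad = np.array([0.0, 0.0, 0.7, 0.0, 0.0, 0.0])  # interaction 2 lacks a main effect

print(satisfies_hierarchy(beta_g, beta_gxe_ok))
print(satisfies_hierarchy(beta_g, beta_gxe_bad))
```

A penalized fit that enforces this structure would only ever return patterns like the first one; how the paper's model achieves that is specified in the full text, not in this sketch.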

Figures

Fig. 1. (a) Dynamic SAFE regions, (b) dynamic GAP SAFE regions.
Fig. 2. Time comparison of proposed algorithms: mean runtime over 100 replicates on the y-axis, dual gap tolerance on the x-axis.
Fig. 3. Log working set size (log10(WS)) for all lambda pairs.
Fig. 4. Model performance (top row: AUC for G×E selection, bottom row: precision for G×E selection) as a function of the number of interactions discovered. p = 2500, n = 100, pG = 10, pG×E = 5.
Fig. 5. Geometric solution to problem (19). The optimal δj maximizes the range of possible x values.
