Bioinformatics. 2021 May 1;37(4):514-521.
doi: 10.1093/bioinformatics/btaa776.

Incorporating prior knowledge into regularized regression

Chubing Zeng et al. Bioinformatics. 2021.

Abstract

Motivation: Genomic features used in statistical modeling of health outcomes, such as gene expression, methylation and genotypes, come with a rich set of meta-features, such as functional annotations, pathway information and knowledge from previous studies. These meta-features can be used post hoc to facilitate the interpretation of a model. However, using this meta-feature information a priori rather than post hoc can yield improved prediction performance as well as enhanced model interpretation.

Results: We propose a new penalized regression approach that allows a priori integration of external meta-features. The method extends LASSO regression by incorporating an individualized penalty parameter for each regression coefficient. The penalty parameters are, in turn, modeled as a log-linear function of the meta-features and are estimated from the data using an approximate empirical Bayes approach. The marginal likelihood on which the empirical Bayes estimation is based is optimized using a fast and stable majorization-minimization procedure. Through simulations, we show that the proposed regression with individualized penalties can outperform the standard LASSO in terms of both parameter estimation and prediction performance when the external data is informative. We further demonstrate our approach with applications to gene expression studies of bone density and breast cancer.
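
The individualized-penalty idea can be illustrated with a minimal sketch. It assumes the objective (1/(2n))·||y − Xβ||² + Σ_j λ_j·|β_j| with log λ_j = z_j·α, where z_j is the meta-feature row for feature j. The function names (penalties_from_meta, weighted_lasso), the 1/(2n) scaling, the simulated data and the hand-picked α are all hypothetical choices made for illustration. In particular, α is fixed by hand here, whereas the abstract describes estimating it from the data by approximate empirical Bayes; this is not the xtune implementation or its API.

```python
import numpy as np


def penalties_from_meta(Z, alpha):
    """Log-linear penalty model (assumed form): log(lambda_j) = Z[j] @ alpha."""
    return np.exp(Z @ alpha)


def soft_threshold(x, t):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def weighted_lasso(X, y, penalties, n_iter=500, tol=1e-10):
    """Coordinate descent for
    (1/(2n)) * ||y - X b||^2 + sum_j penalties[j] * |b_j|."""
    n, p = X.shape
    beta = np.zeros(p)
    residual = np.asarray(y, dtype=float).copy()  # y - X @ beta, with beta = 0
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the partial residual (feature j excluded).
            rho = X[:, j] @ residual / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, penalties[j]) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                residual -= X[:, j] * delta
                beta[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


# Hypothetical simulated data: the first 5 of 20 features carry signal.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:5] = 1.0
y = X @ true_beta + 0.5 * rng.standard_normal(n)

# Meta-features: an intercept column plus an indicator marking "annotated" features.
Z = np.column_stack([np.ones(p), (np.arange(p) < 5).astype(float)])
# alpha is chosen by hand (assumption); a negative second entry lowers the
# penalty for annotated features.
alpha = np.array([np.log(0.3), -3.0])

lam = penalties_from_meta(Z, alpha)
beta_hat = weighted_lasso(X, y, lam)
print("penalties:", np.round(lam, 3))
print("indices of nonzero coefficients:", np.flatnonzero(beta_hat))
```

Setting every entry of the penalty vector to the same value recovers a standard LASSO fit under this scaling, which is one way to see the individualized version as an extension rather than a different model.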

Availability and implementation: The methods have been implemented in the R package xtune freely available for download from https://cran.r-project.org/web/packages/xtune/index.html.

Figures

Fig. 1.
Simulation results. Subplot (a): SNR = 1, 2, 3, with n = 200, p = 1000, q = 10, α0 = 3, δ = 0.5, ρ = 0.2. Subplot (b): p = 500, 1000, 3000, with n = 200, SNR = 2, q = 10, α0 = 3, δ = 0.5, ρ = 0.2. Subplot (c): q = 10, 30, 50, with n = 200, p = 1000, SNR = 2, α0 = 3, δ = 0.5, ρ = 0.2. Subplot (d): regression coefficient sparsity δ = 0.3, 0.5, 0.7, with n = 200, p = 1000, SNR = 2, q = 10, α0 = 3, ρ = 0.2. Subplot (e): overall penalty magnitude α0 = 1, 3, 5, with n = 200, p = 1000, SNR = 2, q = 10, δ = 0.5, ρ = 0.2. Subplot (f): ρ = 0.3, 0.6, 0.9, with n = 200, p = 1000, SNR = 2, q = 10, α0 = 3, δ = 0.5.
Fig. 2.
Comparison of test R² and number of selected covariates for adaptive LASSO, standard LASSO and xtune LASSO on the bone density data. The mean test R² is 0.27 for adaptive LASSO, 0.38 for standard LASSO and 0.43 for xtune LASSO. The mean number of selected covariates is 10 for adaptive LASSO, 43 for standard LASSO and 16 for xtune LASSO.
Fig. 3.
ROC curves for adaptive LASSO, standard LASSO and xtune LASSO applied to the breast cancer dataset.
