Biometrics. 2010 Jun;66(2):474-84.
doi: 10.1111/j.1541-0420.2009.01296.x. Epub 2009 Jul 23.

Incorporating predictor network in penalized regression with application to microarray data


Wei Pan et al. Biometrics. 2010 Jun.

Abstract

We consider penalized linear regression, especially for "large p, small n" problems, for which the relationships among predictors are described a priori by a network. A class of motivating examples includes modeling a phenotype through gene expression profiles while accounting for coordinated functioning of genes in the form of biological pathways or networks. To incorporate the prior knowledge of the similar effect sizes of neighboring predictors in a network, we propose a grouped penalty based on the L_γ-norm that smooths the regression coefficients of the predictors over the network. The main feature of the proposed method is its ability to automatically realize grouped variable selection and exploit grouping effects. We also discuss the effects of the choice of γ and of the weights inside the L_γ-norm. Simulation studies demonstrate the superior finite-sample performance of the proposed method as compared to Lasso, elastic net, and a recently proposed network-based method. The new method performs best in variable selection across all simulation set-ups considered. For illustration, the method is applied to predict survival times for glioblastoma patients using a microarray gene expression dataset and a gene network compiled from Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.
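
The abstract does not spell out the penalty formula, so the following Python sketch shows only one plausible reading of a network-based grouped penalty: for each edge (i, j) of the predictor network, take a weighted L-gamma norm of the coefficient pair, then sum over edges and scale by a tuning parameter. The function names, the per-node `weights` argument, and the exact functional form are assumptions made for illustration; they are not taken from the article.

```python
import numpy as np

def network_lgamma_penalty(beta, edges, weights, gamma=2.0, lam=1.0):
    """Illustrative (assumed) penalty: lam * sum over edges (i, j) of
    (|beta_i|^gamma / w_i + |beta_j|^gamma / w_j)^(1/gamma)."""
    beta = np.asarray(beta, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = 0.0
    for i, j in edges:
        # Weighted sum of gamma-th powers for the two linked predictors.
        pair = (np.abs(beta[i]) ** gamma / weights[i]
                + np.abs(beta[j]) ** gamma / weights[j])
        # The term is zero only when both linked coefficients are zero,
        # which is what encourages neighbors to be selected or dropped together.
        total += pair ** (1.0 / gamma)
    return lam * total

def penalized_objective(beta, X, y, edges, weights, gamma=2.0, lam=1.0):
    """Least-squares loss plus the assumed network penalty."""
    beta = np.asarray(beta, dtype=float)
    resid = y - X @ beta
    return 0.5 * float(resid @ resid) + network_lgamma_penalty(
        beta, edges, weights, gamma=gamma, lam=lam
    )

# Toy network: predictor 0 linked to 1, and 1 linked to 2 (hypothetical data).
edges = [(0, 1), (1, 2)]
weights = [1.0, 1.0, 1.0]
penalty = network_lgamma_penalty([3.0, 4.0, 0.0], edges, weights, gamma=2.0)
```

With gamma = 2 and unit weights, each edge contributes the Euclidean norm of its coefficient pair, so the toy example sums 5 and 4. Minimizing `penalized_objective` would need a solver suited to non-smooth penalties, which this sketch does not attempt.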

Figures

Figure 1
Constraint regions of (β̃1, β̃2) yielding β̂1 = β̂2 = 0 for various γ and λ = 1. This figure appears in color in the electronic version of this article.
Figure 2
PMSEs (± SE) versus tuning parameter s based on ten-fold CV for Lasso for the two sets of the glioblastoma data.
Figure 3
Solution paths or PMSE versus tuning parameter s based on tuning data for Lasso and our new method based on a linear model for the first set of the glioblastoma data. This figure appears in color in the electronic version of this article.

