J Appl Stat. 2021 Mar 4;49(9):2189-2207.
doi: 10.1080/02664763.2021.1893285. eCollection 2022.

Spike-and-slab type variable selection in the Cox proportional hazards model for high-dimensional features

Ryan Wu et al. J Appl Stat.

Abstract

In this paper, we develop a variable selection framework with the spike-and-slab prior distribution via the hazard function of the Cox model. Specifically, we consider the transformation of the score and information functions for the partial likelihood function evaluated at the given data from the parameter space into the space generated by the logarithm of the hazard ratio. Thereby, we reduce the nonlinear complexity of the estimation equation for the Cox model and allow the utilization of a wider variety of stable variable selection methods. Then, we use a stochastic variable search Gibbs sampling approach via the spike-and-slab prior distribution to obtain the sparsity structure of the covariates associated with the survival outcome. Additionally, we conduct numerical simulations to evaluate the finite-sample performance of our proposed method. Finally, we apply this novel framework on lung adenocarcinoma data to find important genes associated with decreased survival in subjects with the disease.
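As a rough illustration of the two quantities the abstract starts from, the score and information functions of the Cox partial likelihood, here is a minimal NumPy sketch. It is not the authors' implementation and it does not perform their transformation into the log hazard ratio space. The function name `cox_score_information` and its arguments are hypothetical, and the sketch assumes there are no tied event times.

```python
import numpy as np


def cox_score_information(beta, X, time, event):
    """Score vector and observed information of the Cox partial log-likelihood.

    Illustrative sketch only; assumes no tied event times.
    beta: (p,) coefficients, X: (n, p) covariates,
    time: (n,) follow-up times, event: (n,) 1 if the event was observed.
    """
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    p = X.shape[1]
    w = np.exp(X @ beta)              # relative hazards exp(x_i' beta)
    score = np.zeros(p)
    info = np.zeros((p, p))

    for i in np.flatnonzero(event):
        at_risk = time >= time[i]     # risk set at the i-th event time
        wr, Xr = w[at_risk], X[at_risk]
        s0 = wr.sum()
        # Weighted mean and second moment of covariates over the risk set.
        xbar = (wr[:, None] * Xr).sum(axis=0) / s0
        second = (wr[:, None] * Xr).T @ Xr / s0
        score += X[i] - xbar
        info += second - np.outer(xbar, xbar)

    return score, info
```

Under these assumptions, one Newton-Raphson update of the coefficients would be `beta + np.linalg.solve(info, score)`, provided the information matrix is nonsingular.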

Keywords: 62J05; 62N02; Bayesian modeling; Markov chain Monte Carlo; latent indicator; lung adenocarcinoma; score function; stochastic variable search.

Conflict of interest statement

No potential conflict of interest was reported by the author(s).

Figures

Figure 1.
The empirical average of incorrect nonzero covariates across all scenarios: Panels A, B, C, and D depict the graphical representations of Ê_IN for all scenarios including the two low dimensional settings (n = 1000, p = 100 and n = 3000, p = 100), and two high dimensional settings (n = 1000, p = 4000 and n = 3000, p = 400), respectively. In each panel, we included the proposed method (solid line with open circle), Bayesian Cox method (dashed line with open triangle), Cox lasso method (dotted line with filled circle), and the method with minimized approximated information criterion (dot-dashed line with filled triangle).
Figure 2.
Kaplan-Meier plots for the PBC data set: Panels A, B, C, and D contain the survival probabilities for the presence of ascites, presence of spiders, presence of hepatomegaly, and amount of serum cholesterol, respectively. For A, B, and C, the black lines indicate the presence of the covariate whereas the grey lines indicate the absence of the covariate. For D, the black line represents serum cholesterol levels higher than the median level, whereas the grey line represents subjects who had serum cholesterol levels lower than the median level. In each panel, p-values for the log-rank test are provided.
Figure 3.
Kaplan-Meier plots for the lung adenocarcinoma data set: Panels A, B, C, and D contain the survival probabilities for the PRKACB, GAPDH, KLF6, and STX1A genes, respectively. The black lines indicate levels of the gene higher than the median level, whereas the grey lines indicate levels of the gene lower than the median level. In each panel, p-values for the log-rank test are provided.
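
The median split and survival curves described in these captions can be sketched as follows. This is a generic Kaplan-Meier product-limit estimator in NumPy, not the code used for the published figures. The helper names `kaplan_meier` and `split_by_median` are hypothetical, and the log-rank test is not included.

```python
import numpy as np


def kaplan_meier(time, event):
    """Product-limit estimate of the survival function.

    Returns the distinct observed event times and the estimated
    survival probability just after each of them.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    event_times = np.unique(time[event])
    surv = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)            # subjects still under observation
        deaths = np.sum((time == t) & event)   # events observed at time t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)


def split_by_median(values):
    """Boolean mask: True where the value is strictly above the median."""
    values = np.asarray(values, dtype=float)
    return values > np.median(values)
```

With a covariate vector such as a gene expression level, `high = split_by_median(values)` gives the two groups, and calling `kaplan_meier(time[high], event[high])` and `kaplan_meier(time[~high], event[~high])` gives one curve per group.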
