Bioinformatics. 2022 Aug 10;38(16):3918-3926.
doi: 10.1093/bioinformatics/btac416.

Variational Bayes for high-dimensional proportional hazards models with applications within gene expression

Michael Komodromos et al. Bioinformatics.

Abstract

Motivation: Few Bayesian methods for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify the uncertainty at high (unscalable) computational expense.

Results: We bridge this gap and develop an interpretable and scalable Bayesian proportional hazards model for prediction and variable selection, referred to as sparse variational Bayes. Our method, based on a mean-field variational approximation, overcomes the high computational cost of Markov chain Monte Carlo, whilst retaining useful features, providing a posterior distribution for the parameters and offering a natural mechanism for variable selection via posterior inclusion probabilities. The performance of our proposed method is assessed via extensive simulations and compared against other state-of-the-art Bayesian variable selection methods, demonstrating comparable or better performance. Finally, we demonstrate how the proposed method can be used for variable selection on two transcriptomic datasets with censored survival outcomes, and how the uncertainty quantification offered by our method can be used to provide an interpretable assessment of patient risk.
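As a rough illustration of variable selection via posterior inclusion probabilities, the sketch below assumes a fitted mean-field spike-and-slab approximation summarized by three per-coefficient vectors: slab means `mu`, slab standard deviations `sigma`, and inclusion probabilities `gamma`. The container `VariationalFit`, the helper names and the 0.5 threshold are hypothetical choices made for this example; they are not the interface of survival.svb.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariationalFit:
    """Hypothetical container for mean-field spike-and-slab parameters."""
    mu: List[float]      # slab means, one per coefficient
    sigma: List[float]   # slab standard deviations, one per coefficient
    gamma: List[float]   # posterior inclusion probabilities in [0, 1]


def select_variables(fit: VariationalFit, threshold: float = 0.5) -> List[int]:
    """Return indices whose inclusion probability exceeds the threshold.

    The 0.5 default is an assumed cut-off for illustration only.
    """
    return [j for j, g in enumerate(fit.gamma) if g > threshold]


def posterior_mean(fit: VariationalFit) -> List[float]:
    """Mean of each coefficient under the spike-and-slab approximation.

    With probability gamma_j the coefficient is drawn from the slab (mean mu_j),
    otherwise it is exactly zero, so E[beta_j] = gamma_j * mu_j.
    """
    return [g * m for g, m in zip(fit.gamma, fit.mu)]


if __name__ == "__main__":
    # Made-up parameters, not results from the paper.
    fit = VariationalFit(mu=[1.2, 0.0, -0.8],
                         sigma=[0.1, 0.1, 0.2],
                         gamma=[0.99, 0.02, 0.60])
    print("selected:", select_variables(fit))
    print("posterior mean:", posterior_mean(fit))
```

The point of the sketch is that selection and effect estimation both read directly off the variational parameters, with no sampling required.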

Availability and implementation: Our method has been implemented as a freely available R package, survival.svb (https://github.com/mkomod/survival.svb).

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Ovarian cancer dataset model convergence diagnostics for λ = 1
Fig. 2.
(A) Kaplan–Meier curves for patients in low- and high-risk groups. (B) Comparison of patients in the low- and high-risk groups (ordered by η̂): within each cell, the (variational) posterior probability that the patient in row i is at greater risk than the patient in column j is computed. Samples are taken from the second validation fold, and the fit with λ = 2.5 is used.
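
The pairwise comparison described for panel B can be sketched with a simple Monte Carlo estimate. This is a minimal example, not the authors' implementation: it assumes the linear predictor is η = xᵀβ and that each coefficient in the variational posterior is independently either zero (with probability 1 − γ) or Gaussian with mean μ and standard deviation σ. The function names, the sample size and the seed are hypothetical.

```python
import random
from typing import List, Sequence


def sample_beta(mu: Sequence[float], sigma: Sequence[float],
                gamma: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one coefficient vector from the assumed spike-and-slab approximation."""
    # Each coefficient is included with probability gamma_k; otherwise it is 0.
    return [rng.gauss(m, s) if rng.random() < g else 0.0
            for m, s, g in zip(mu, sigma, gamma)]


def prob_higher_risk(x_i: Sequence[float], x_j: Sequence[float],
                     mu: Sequence[float], sigma: Sequence[float],
                     gamma: Sequence[float],
                     n_samples: int = 5000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(eta_i > eta_j), where eta = x^T beta."""
    rng = random.Random(seed)
    # eta_i - eta_j = (x_i - x_j)^T beta, so only the covariate difference matters.
    diff = [a - b for a, b in zip(x_i, x_j)]
    hits = 0
    for _ in range(n_samples):
        beta = sample_beta(mu, sigma, gamma, rng)
        if sum(d * b for d, b in zip(diff, beta)) > 0:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    # Made-up covariates and variational parameters for two patients.
    mu, sigma, gamma = [0.9, -0.4], [0.2, 0.3], [0.95, 0.50]
    patient_a, patient_b = [1.5, 0.2], [0.3, 0.1]
    print(prob_higher_risk(patient_a, patient_b, mu, sigma, gamma))
```

Evaluating `prob_higher_risk` for every ordered pair of patients would fill a matrix of the kind the caption describes. When all sampled coefficients are zero the two predictors tie, and the sketch counts a tie as "not at greater risk".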
