PLoS One. 2014 Jan 6;9(1):e84483.
doi: 10.1371/journal.pone.0084483. eCollection 2014.

Boosting the concordance index for survival data--a unified framework to derive and evaluate biomarker combinations

Andreas Mayr et al. PLoS One. 2014.

Abstract

The development of molecular signatures for the prediction of time-to-event outcomes is a methodologically challenging task in bioinformatics and biostatistics. Although there are numerous approaches for the derivation of marker combinations and their evaluation, the underlying methodology often suffers from the problem that different optimization criteria are mixed during the feature selection, estimation and evaluation steps. This might result in marker combinations that are suboptimal regarding the evaluation criterion of interest. To address this issue, we propose a unified framework to derive and evaluate biomarker combinations. Our approach is based on the concordance index for time-to-event data, which is a non-parametric measure to quantify the discriminatory power of a prediction rule. Specifically, we propose a gradient boosting algorithm that results in linear biomarker combinations that are optimal with respect to a smoothed version of the concordance index. We investigate the performance of our algorithm in a large-scale simulation study and in two molecular data sets for the prediction of survival in breast cancer patients. Our numerical results show that the new approach is not only methodologically sound but can also lead to a higher discriminatory power than traditional approaches for the derivation of gene signatures.
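To make the central quantity concrete, the following is a minimal sketch of a concordance index for right-censored data and of a sigmoid-smoothed variant of the kind the abstract alludes to. It is not the authors' estimator: the function names, the `sigma` smoothing parameter, the pairwise (Harrell-type) definition without censoring weights, and the convention that a higher score means higher risk are all assumptions made for illustration.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def _comparable_pairs(time, event):
    """Yield index pairs (i, j) where i has an observed event before time[j]."""
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if time[i] < time[j]:
                yield i, j


def harrell_c(time, event, score):
    """Pairwise concordance index (assumed convention: higher score = higher risk).

    A comparable pair is concordant when the subject who fails earlier
    has the larger score; tied scores count one half.
    """
    concordant, comparable = 0.0, 0
    for i, j in _comparable_pairs(time, event):
        comparable += 1
        if score[i] > score[j]:
            concordant += 1.0
        elif score[i] == score[j]:
            concordant += 0.5
    return concordant / comparable


def smoothed_c(time, event, score, sigma=0.1):
    """Smoothed concordance: the indicator I(score_i > score_j) is replaced
    by sigmoid((score_i - score_j) / sigma), which is differentiable in the
    scores. `sigma` is a hypothetical tuning parameter; smaller values bring
    the result closer to the unsmoothed index.
    """
    total, comparable = 0.0, 0
    for i, j in _comparable_pairs(time, event):
        comparable += 1
        total += _sigmoid((score[i] - score[j]) / sigma)
    return total / comparable


if __name__ == "__main__":
    # Toy data (invented for illustration): the third subject is censored.
    time = [2, 4, 6, 8]
    event = [1, 1, 0, 1]
    risk = [4.0, 3.0, 2.0, 1.0]
    print(harrell_c(time, event, risk))
    print(smoothed_c(time, event, risk, sigma=0.1))
```

Because the smoothed version is differentiable with respect to the scores, its gradient could in principle serve as the working response in a gradient boosting loop over a linear combination of markers; that step is omitted from this sketch.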

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Coefficient estimates for pre-selected markers obtained from 100 simulation runs.
The marker combinations were optimized via gradient boosting based on training samples of two different sizes (left and right panels). Boxplots represent the empirical distribution of the resulting coefficients. Only a subset of the markers had an actual effect on the survival time.
Figure 2. Simulation results for the discriminatory power obtained via the proposed C-index boosting approach and competing Cox-based estimation schemes.
The marker combinations were optimized via the different approaches based on training samples of two different sizes (left and right panels). Boxplots represent the empirical distribution of the resulting C-index estimates on corresponding test samples. The dotted line corresponds to the discriminatory power resulting from the true combination of predictors with known coefficients.
Figure 3. Comparing the discriminatory power of biomarker combinations to predict the time to distant metastases resulting from the proposed C-index boosting approach with competing estimation schemes.
The plot on the left refers to the Desmedt et al. data, whereas the plot on the right presents results from the van de Vijver et al. data. All biomarker combinations were optimized via the corresponding algorithms based on the same 100 learning samples. Boxplots represent the empirical distribution of the resulting C-index estimates on corresponding test samples. The dotted line corresponds to the median C-index resulting from the new approach.

References

    1. Desmedt C, Piette F, Loi S, Wang Y, Lallemand F, et al. (2007) Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series. Clinical Cancer Research 13: 3207–3214.
    2. van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AAM, et al. (2002) A gene-expression signature as a predictor of survival in breast cancer. New England Journal of Medicine 347: 1999–2009.
    3. Kok M, Linn SC, Laar RKV, Jansen MPHM, van den Berg TM, et al. (2009) Comparison of gene expression profiles predicting progression in breast cancer patients treated with tamoxifen. Breast Cancer Research and Treatment 13: 275–283.
    4. Li H, Gui J (2004) Partial Cox regression analysis for high-dimensional microarray gene expression data. Bioinformatics 20: 208–215.
    5. Chang HY, Sneddon JB, Alizadeh AA, Sood R, West RB, et al. (2004) Gene expression signature of fibroblast serum response predicts human cancer progression: Similarities between tumors and wounds. PLoS Biology 2.
