Review
Bioanalysis. 2010 May;2(5):855-62.
doi: 10.4155/bio.10.35.

Derivation of cancer diagnostic and prognostic signatures from gene expression data


Steve Goodison et al. Bioanalysis. 2010 May.

Abstract

The ability to compare genome-wide expression profiles in human tissue samples has the potential to add an invaluable molecular pathology aspect to the detection and evaluation of multiple diseases. Applications include initial diagnosis, evaluation of disease subtype, monitoring of response to therapy and the prediction of disease recurrence. The derivation of molecular signatures that can predict tumor recurrence in breast cancer has been a particularly intense area of investigation and a number of studies have shown that molecular signatures can outperform currently used clinicopathologic factors in predicting relapse in this disease. However, many of these predictive models have been derived using relatively simple computational algorithms and whether these models are at a stage of development worthy of large-cohort clinical trial validation is currently a subject of debate. In this review, we focus on the derivation of optimal molecular signatures from high-dimensional data and discuss some of the expected future developments in the field.


Figures

Figure 1. Computational experimental procedure used to derive predictive molecular signatures for prostate cancer recurrence using gene expression data
The experimental protocol consists of an inner and an outer loop. The outer loop holds out one sample and passes the remaining samples to the inner loop as training data. The inner loop performs LOOCV on that training data to estimate the optimal classifier parameters. The held-out sample is then classified using the best parameters from the inner loop. The procedure is repeated until each sample has been tested. The held-out test sample is not involved in any stage of the training process. LOOCV: Leave-one-out cross-validation.
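
The nested procedure in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the caption does not name the classifier or its parameters, so a k-nearest-neighbour classifier with the neighbour count `k` as the tuned parameter is swapped in here, and the two-feature toy data, the function names and the candidate values of `k` are all hypothetical rather than taken from the review.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Classify x by majority vote among its k nearest training samples.

    k-NN is an assumed stand-in classifier, not the one used in the review.
    """
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loocv_accuracy(X, y, k):
    """Inner loop: leave-one-out accuracy of k-NN on the training data only."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


def nested_loocv(X, y, candidate_ks):
    """Outer loop: hold out each sample once, tune k on the rest, classify it."""
    predictions = []
    for i in range(len(X)):
        # The held-out sample X[i] never enters parameter selection.
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        best_k = max(candidate_ks,
                     key=lambda k: loocv_accuracy(train_X, train_y, k))
        predictions.append(knn_predict(train_X, train_y, X[i], best_k))
    return predictions


# Hypothetical toy data: two well-separated groups of four samples each.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
     [1.0, 1.1], [1.2, 0.9], [0.9, 1.3], [1.1, 1.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

preds = nested_loocv(X, y, candidate_ks=[1, 3])
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
print(f"nested LOOCV accuracy: {accuracy:.2f}")
```

Selecting `best_k` inside the outer loop, rather than once on the full data set, is what keeps the reported accuracy free of information from the sample being tested.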

