BMC Bioinformatics. 2004 Mar 11;5:26.

doi: 10.1186/1471-2105-5-26.

Computational protein biomarker prediction: a case study for prostate cancer

Michael Wagner¹, Dayanand N Naik, Alex Pothen, Srinivas Kasukurti, Raghu Ram Devineni, Bao-Ling Adam, O John Semmes, George L Wright Jr

Affiliation

¹ Cincinnati Children's Hospital Research Foundation and Department of Biomedical Engineering, University of Cincinnati, Cincinnati, OH 45229, USA. mwagner@cchmc.org

PMID: 15113409
PMCID: PMC406491

Michael Wagner et al. BMC Bioinformatics. 2004.

Abstract

Background: Recent technological advances in mass spectrometry pose challenges in computational mathematics and statistics to process the mass spectral data into predictive models with clinical and biological significance. We discuss several classification-based approaches to finding protein biomarker candidates using protein profiles obtained via mass spectrometry, and we assess their statistical significance. Our overall goal is to implicate peaks that have a high likelihood of being biologically linked to a given disease state, and thus to narrow the search for biomarker candidates.

Results: Thorough cross-validation studies and randomization tests are performed on a prostate cancer dataset with over 300 patients, obtained at the Eastern Virginia Medical School using SELDI-TOF mass spectrometry. We obtain average classification accuracies of 87% on a four-group classification problem using a two-stage linear SVM-based procedure and just 13 peaks, with other methods performing comparably.
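
The randomization test mentioned above can be illustrated with a minimal sketch. It is not the authors' two-stage SVM procedure: a nearest-centroid classifier stands in for the SVM, the data are synthetic, and all function names (`fit_centroids`, `predict`, `cv_accuracy`, `randomization_test`) are hypothetical. The idea is to compare the cross-validated accuracy on the true labels with the accuracies obtained after the labels have been shuffled.

```python
import random

def fit_centroids(X, y, rows):
    """Mean feature vector per class, computed from the given training rows."""
    sums, counts = {}, {}
    for i in rows:
        acc = sums.setdefault(y[i], [0.0] * len(X[i]))
        for j, v in enumerate(X[i]):
            acc[j] += v
        counts[y[i]] = counts.get(y[i], 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(centroids, x):
    """Label of the centroid closest to x in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def cv_accuracy(X, y, k, rng):
    """Accuracy of one k-fold cross-validation run on a random partition."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    correct = 0
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        centroids = fit_centroids(X, y, train)
        correct += sum(1 for i in test if predict(centroids, X[i]) == y[i])
    return correct / len(X)

def randomization_test(X, y, k=5, n_perm=200, seed=0):
    """Return (observed accuracy, empirical p-value under label shuffling)."""
    rng = random.Random(seed)
    observed = cv_accuracy(X, y, k, rng)
    at_least_as_good = 0
    for _ in range(n_perm):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break any link between profiles and labels
        if cv_accuracy(X, shuffled, k, rng) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (at_least_as_good + 1) / (n_perm + 1)

# Synthetic two-group data (an assumption for illustration, not SELDI-TOF data):
# only feature 0 separates the groups; the other four features are noise.
data_rng = random.Random(42)
y = [0] * 40 + [1] * 40
X = [[data_rng.gauss(4.0 * label if j == 0 else 0.0, 1.0) for j in range(5)]
     for label in y]

observed, p_value = randomization_test(X, y)
print(f"accuracy={observed:.2f}, p={p_value:.3f}")
```

A small p-value here only says that the observed accuracy is unlikely to arise when labels carry no information; it does not by itself establish biological relevance.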

Conclusions: Modern feature selection and classification methods are powerful techniques for both the identification of biomarker candidates and the related problem of building predictive models from protein mass spectrometric profiles. Cross-validation and randomization are essential tools that must be performed carefully in order not to bias the results unfairly. However, only a biological validation and identification of the underlying proteins will ultimately confirm the actual value and power of any computational predictions.
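
One well-known way in which cross-validation can bias results is selecting features on the full dataset before splitting it into folds. Whether this is the specific pitfall the authors have in mind is an assumption; the abstract does not say. The sketch below uses pure-noise synthetic data, a simple two-class separation score, and a nearest-centroid classifier, all of which are illustrative stand-ins with hypothetical names rather than the methods of the paper.

```python
import random

def mean_var(values):
    """Mean and population variance of a non-empty list."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)

def separation_score(values, labels):
    """Squared difference of class means over summed class variances."""
    (ma, va) = mean_var([v for v, l in zip(values, labels) if l == 0])
    (mb, vb) = mean_var([v for v, l in zip(values, labels) if l == 1])
    return (ma - mb) ** 2 / (va + vb) if va + vb > 0 else 0.0

def top_features(X, y, rows, n_keep):
    """Indices of the n_keep best-scoring features, using only `rows`."""
    labels = [y[i] for i in rows]
    scores = [separation_score([X[i][j] for i in rows], labels)
              for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:n_keep]

def fold_correct(X, y, train, test, feats):
    """Number of correct nearest-centroid predictions on the test rows."""
    cents = {}
    for c in (0, 1):
        members = [i for i in train if y[i] == c]
        cents[c] = [sum(X[i][j] for i in members) / len(members) for j in feats]
    correct = 0
    for i in test:
        dist = {c: sum((X[i][j] - m) ** 2 for j, m in zip(feats, cents[c]))
                for c in cents}
        if min(dist, key=dist.get) == y[i]:
            correct += 1
    return correct

def cv_accuracy(X, y, k, n_keep, select_inside, rng):
    """k-fold accuracy with feature selection inside or outside the folds."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    global_feats = top_features(X, y, idx, n_keep)  # sees the test samples too
    correct = 0
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        feats = top_features(X, y, train, n_keep) if select_inside else global_feats
        correct += fold_correct(X, y, train, test, feats)
    return correct / len(X)

# Pure noise: 40 samples, 500 features, labels unrelated to the data.
noise_rng = random.Random(7)
y = [0] * 20 + [1] * 20
X = [[noise_rng.gauss(0.0, 1.0) for _ in range(500)] for _ in y]

biased = cv_accuracy(X, y, 5, 10, select_inside=False, rng=random.Random(1))
honest = cv_accuracy(X, y, 5, 10, select_inside=True, rng=random.Random(1))
print(f"selection outside folds: {biased:.2f}, inside folds: {honest:.2f}")
```

Because the data contain no signal, any accuracy well above chance in the first variant is an artifact of letting the held-out samples influence which features were kept.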

Figures

**Figure 1**
Accuracy and standard deviation estimates as a function of the number of cross-validation runs (shown, as an example, for the Fisher method with 15 peaks). Significant variability can be observed at the beginning, which motivates the need for a large number of runs in order to arrive at reasonable estimates.
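
The kind of curve described in this caption can be computed by tracking the running mean and standard deviation of per-run accuracies. The following is a minimal sketch using Welford's online update; the per-run accuracies are synthetic placeholders drawn from a normal distribution (not values from the paper), and `running_estimates` is a hypothetical helper name.

```python
import math
import random

def running_estimates(accuracies):
    """Return (n, mean, sample std) after each of the first n runs."""
    estimates = []
    count, avg, m2 = 0, 0.0, 0.0
    for acc in accuracies:
        count += 1
        delta = acc - avg
        avg += delta / count
        m2 += delta * (acc - avg)  # running sum of squared deviations
        sd = math.sqrt(m2 / (count - 1)) if count > 1 else 0.0
        estimates.append((count, avg, sd))
    return estimates

# Placeholder accuracies standing in for repeated cross-validation runs.
sim_rng = random.Random(1)
run_accuracies = [sim_rng.gauss(0.85, 0.04) for _ in range(1000)]
estimates = running_estimates(run_accuracies)

for n, avg, sd in (estimates[4], estimates[49], estimates[-1]):
    print(f"runs={n:4d}  mean={avg:.3f}  std={sd:.3f}")
```

Plotting the mean and standard deviation columns against the run count shows how quickly the estimates settle, which is one way to judge how many runs are enough.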

Grants and funding

U01 CA085067/CA/NCI NIH HHS/United States
