Protocols for disease classification from mass spectrometry data

Michael Wagner¹, Dayanand Naik, Alex Pothen

PMID: 12973727
DOI: 10.1002/pmic.200300519

Proteomics. 2003 Sep;3(9):1692-8.

Affiliation

¹ Pediatric Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.

Abstract

We report our results in classifying protein matrix-assisted laser desorption/ionization-time of flight mass spectra obtained from serum samples into diseased and healthy groups. We discuss in detail five of the steps in preprocessing the mass spectral data for biomarker discovery, as well as our criterion for choosing a small set of peaks for classifying the samples. Cross-validation studies with four selected proteins yielded misclassification rates in the 10-15% range for all the classification methods. Three of these proteins or protein fragments are down-regulated and one up-regulated in lung cancer, the disease under consideration in this data set. When cross-validation studies are performed, care must be taken to ensure that the test set does not influence the choice of the peaks used in the classification. Misclassification rates are lower when both the training and test sets are used to select the peaks used in classification versus when only the training set is used. This expectation was validated for various statistical discrimination methods when thirteen peaks were used in cross-validation studies. One particular classification method, a linear support vector machine, exhibited especially robust performance when the number of peaks was varied from four to thirteen, and when the peaks were selected from the training set alone. Experiments with the samples randomly assigned to the two classes confirmed that misclassification rates were significantly higher in such cases than those observed with the true data. This indicates that our findings are indeed significant. We found closely matching masses in a database for protein expression in lung cancer for three of the four proteins we used to classify lung cancer. Data from additional samples, increased experience with the performance of various preprocessing techniques, and affirmation of the biological roles of the proteins that help in classification, will strengthen our conclusions in the future.
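The abstract's warning about cross-validation can be illustrated with a small sketch. It compares two protocols: choosing peaks once from all samples (test samples leak into the selection) and choosing them again inside each training fold. Everything here is an assumption made for illustration and is not taken from the paper: the synthetic Gaussian "spectra", the t-statistic ranking, the function names, and the nearest-centroid classifier, which stands in for the linear support vector machine and other methods the authors used so that the example needs only the Python standard library. The `shuffled_labels` helper mirrors the abstract's random-assignment control.

```python
import random

def mean(values):
    return sum(values) / len(values)

def t_score(a, b):
    """Absolute Welch t statistic between two groups of intensities for one peak."""
    ma, mb = mean(a), mean(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    denom = (va / len(a) + vb / len(b)) ** 0.5
    return abs(ma - mb) / denom if denom > 0 else 0.0

def select_peaks(X, y, rows, k):
    """Indices of the k peaks that best separate the classes, judged on `rows` only."""
    scored = []
    for j in range(len(X[0])):
        a = [X[i][j] for i in rows if y[i] == 0]
        b = [X[i][j] for i in rows if y[i] == 1]
        scored.append((t_score(a, b), j))
    return [j for _, j in sorted(scored, reverse=True)[:k]]

def predict(X, y, train, peaks, i):
    """Nearest-centroid label for sample i (illustrative stand-in classifier)."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        members = [r for r in train if y[r] == label]
        dist = sum((X[i][j] - mean([X[r][j] for r in members])) ** 2 for j in peaks)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def cv_error(X, y, k, n_folds, select_inside_fold, rng):
    """Cross-validated misclassification rate using k selected peaks."""
    order = list(range(len(X)))
    rng.shuffle(order)
    folds = [order[f::n_folds] for f in range(n_folds)]
    leaky_peaks = select_peaks(X, y, order, k)  # sees the test samples too
    errors = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in order if i not in held_out]
        # The careful protocol reselects peaks from the training fold alone.
        peaks = select_peaks(X, y, train, k) if select_inside_fold else leaky_peaks
        errors += sum(predict(X, y, train, peaks, i) != y[i] for i in test)
    return errors / len(X)

def make_data(rng, n_per_class, n_peaks, n_informative, shift):
    """Synthetic data: class 1 is shifted on the first n_informative peaks."""
    y = [0] * n_per_class + [1] * n_per_class
    X = [[rng.gauss(0.0, 1.0) + (shift if label == 1 and j < n_informative else 0.0)
          for j in range(n_peaks)] for label in y]
    return X, y

def shuffled_labels(y, rng):
    """Randomly reassign samples to the two classes, keeping class sizes."""
    z = list(y)
    rng.shuffle(z)
    return z
```

On pure-noise data (`make_data(rng, 20, 300, 0, 0.0)`), calling `cv_error` with `select_inside_fold=False` would be expected to report an optimistically low error, while `select_inside_fold=True` should stay near chance. Running `cv_error` on `shuffled_labels(y, rng)` gives a baseline against which the error on the true labels can be judged.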
