Tree-based disease classification using protein data
- PMID: 12973723
- DOI: 10.1002/pmic.200300520
Abstract
A reliable and precise classification of diseases is essential for successful diagnosis and treatment. Using mass spectrometry of clinical specimens, scientists may identify protein variations among diseases and use this information to improve diagnosis. In this paper, we propose a novel procedure to classify disease status based on protein data from mass spectrometry. Our new tree-based algorithm consists of three steps: projection, selection and classification tree. The projection step projects all observations from the specimens onto the same basis so that the projected data have fixed coordinates. Thus, for each specimen, we obtain a large vector of 'coefficients' on the same basis. The selection step reduces the data by condensing the large vector from the projection step into a much lower-dimensional, informative vector. Finally, using these reduced vectors, we apply recursive partitioning to construct an informative classification tree. This method has been successfully applied to protein data provided by the Department of Radiology and Chemistry at Duke University.
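The three steps can be illustrated with a minimal sketch. The abstract does not specify the basis, the selection rule, or the splitting criterion, so everything concrete here is an assumption made for illustration: a piecewise-constant basis (average intensity in equal-width m/z bins), selection by the largest difference in class means, and Gini impurity for recursive partitioning. The function names (`project`, `select`, `grow`, `predict`) and the synthetic data from `fake_spectrum` are hypothetical and not taken from the paper.

```python
import random

def project(spectrum, lo, hi, n_bins):
    """Projection (assumed basis): mean intensity in n_bins equal-width m/z bins."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    width = (hi - lo) / n_bins
    for mz, intensity in spectrum:
        if lo <= mz < hi:
            b = min(int((mz - lo) / width), n_bins - 1)
            sums[b] += intensity
            counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def select(X, y, k):
    """Selection (assumed rule): indices of the k coefficients whose class means differ most."""
    def score(j):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def grow(X, y, depth=0, max_depth=3):
    """Recursive partitioning: choose the split with the lowest weighted Gini impurity."""
    majority = int(2 * sum(y) >= len(y))
    if depth == max_depth or gini(y) == 0.0:
        return majority  # leaf: predicted class
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for a, b in zip(values, values[1:]):
            t = (a + b) / 2.0  # candidate threshold between adjacent values
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            cost = len(left) * gini(left) + len(right) * gini(right)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    if best is None:
        return majority  # no feature has two distinct values
    _, j, t = best
    lo_part = [(row, lab) for row, lab in zip(X, y) if row[j] <= t]
    hi_part = [(row, lab) for row, lab in zip(X, y) if row[j] > t]
    return (j, t,
            grow([r for r, _ in lo_part], [l for _, l in lo_part], depth + 1, max_depth),
            grow([r for r, _ in hi_part], [l for _, l in hi_part], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return its class."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

rng = random.Random(0)

def fake_spectrum(label):
    """Synthetic specimen (not real data): its own m/z grid, extra peak near 5000 if label == 1."""
    offset = rng.uniform(0.0, 30.0)  # each specimen is sampled at different m/z values
    points = []
    for i in range(300):
        mz = 1000.0 + offset + 30.0 * i
        intensity = rng.uniform(0.0, 1.0)
        if label == 1 and 4800.0 <= mz < 5200.0:
            intensity += 5.0
        points.append((mz, intensity))
    return points

labels = [i % 2 for i in range(40)]
coeffs = [project(fake_spectrum(lab), 1000.0, 10000.0, 45) for lab in labels]
keep = select(coeffs, labels, k=2)                      # 45 coefficients -> 2
reduced = [[row[j] for j in keep] for row in coeffs]
tree = grow(reduced, labels)
```

Because every specimen is projected onto the same 45 bins, coefficient `j` means the same thing for all specimens even though their raw m/z grids differ. That shared meaning is what makes the later selection and tree steps comparable across specimens. A new specimen would be classified with `predict(tree, [c[j] for j in keep])`, where `c` is its projected coefficient vector.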