Two-phase Filtering Strategy for Efficient Peptide Identification from Mass Spectrometry
- PMID: 20717493
- PMCID: PMC2921811
- DOI: 10.4172/jpb.1000130
Abstract
Peptide identification by tandem mass spectrometry (MS/MS) is one of the most important problems in proteomics. Recent advances in high-throughput MS/MS experiments produce huge numbers of spectra, and the peptide identification process must keep pace. In this paper, we aim for both high accuracy and high efficiency in peptide identification in the presence of noise, using a two-phase filtering strategy. Our algorithm transforms spectra into high-dimensional vectors, and then uses a self-organizing map (SOM) and multi-point range query (MPRQ) as very efficient coarse filters to select a number of candidate peptides from the database. These candidate peptides are subsequently scored and ranked by an accurate tag-based scoring function S(λ). Experiments showed that our approach is both fast and accurate for peptide identification.
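The general shape of a two-phase (coarse filter, then fine score) search can be illustrated with a minimal sketch. This is not the paper's method: a fixed m/z binning plus a Euclidean range check stands in for the SOM and MPRQ coarse filters, and a simple shared-peak count stands in for the tag-based function S(λ). All names (`spectrum_to_vector`, `coarse_filter`, `fine_score`, `identify`), the peptide labels, the peak lists, and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def spectrum_to_vector(peaks, max_mz=2000.0, bin_width=100.0):
    """Bin (m/z, intensity) peaks into a fixed-length, unit-normalized vector."""
    n_bins = int(math.ceil(max_mz / bin_width))
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        if 0.0 <= mz < max_mz:
            vec[int(mz // bin_width)] += intensity
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def coarse_filter(query_vec, db_vectors, radius):
    """Phase 1 (stand-in for SOM + MPRQ): keep entries within `radius` of the query."""
    return [name for name, vec in db_vectors.items()
            if euclidean(query_vec, vec) <= radius]

def fine_score(query_peaks, candidate_peaks, tol=0.5):
    """Phase 2 (stand-in for S(lambda)): count query peaks matched within `tol`."""
    cand_mz = [mz for mz, _ in candidate_peaks]
    return sum(1 for mz, _ in query_peaks
               if any(abs(mz - c) <= tol for c in cand_mz))

def identify(query_peaks, database, radius=0.8):
    """Run the coarse filter, then score and rank only the surviving candidates."""
    db_vectors = {name: spectrum_to_vector(p) for name, p in database.items()}
    query_vec = spectrum_to_vector(query_peaks)
    candidates = coarse_filter(query_vec, db_vectors, radius)
    scored = [(name, fine_score(query_peaks, database[name])) for name in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical theoretical spectra, keyed by made-up peptide labels.
DATABASE = {
    "pepA": [(150.0, 1.0), (320.0, 1.0), (510.0, 1.0), (730.0, 1.0)],
    "pepB": [(155.0, 1.0), (325.0, 1.0), (515.0, 1.0), (990.0, 1.0)],
    "pepC": [(1100.0, 1.0), (1250.0, 1.0), (1400.0, 1.0), (1800.0, 1.0)],
}

# Hypothetical query: pepA with small mass errors and one weak noise peak.
QUERY = [(150.2, 1.0), (320.1, 1.0), (509.8, 1.0), (730.3, 1.0), (1600.0, 0.2)]

if __name__ == "__main__":
    for name, score in identify(QUERY, DATABASE):
        print(name, score)
```

In this toy data the coarse filter discards the clearly dissimilar entry before any peak matching is done, so the more expensive scoring step only runs on the remaining candidates. That ordering, cheap filtering first and accurate scoring second, is the point of the two-phase design.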