Peptide sequence tag-based blind identification of post-translational modifications with point process model
- PMID: 16873487
- DOI: 10.1093/bioinformatics/btl226
Abstract
An important but difficult problem in proteomics is the identification of post-translational modifications (PTMs) in a protein. In general, identifying PTMs by aligning experimental spectra with theoretical spectra from peptides in a peptide database is very time consuming and may lead to a high false positive rate. In this paper, we introduce a new approach that is both efficient and effective for blind PTM identification. Our work consists of the following phases. First, we develop a novel tree-decomposition-based algorithm that can efficiently generate peptide sequence tags (PSTs) from an extended spectrum graph. Sequence tags are selected from all maximum weighted antisymmetric paths in the graph, and their reliabilities are evaluated with a score function. An efficient deterministic finite automaton (DFA) based model is then developed to search a peptide database for candidate peptides using the generated sequence tags. Finally, a point process model, an efficient blind search approach for PTM identification, is applied to report the correct peptide and its PTMs, if there are any. Our tests on 2657 experimental tandem mass spectra and 2620 experimental spectra with one artificially added PTM show that, in addition to high efficiency, our ab initio sequence tag selection algorithm achieves better or comparable accuracy compared with other approaches. Database search results show that sequence tags of lengths 3 and 4 filter out more than 98.3% and 99.8% of peptides, respectively, when applied to a yeast peptide database. With the dramatically reduced search space, the point process model achieves a significant improvement in accuracy as well.
Availability: The software is available upon request.
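The tag-based database filtering step can be illustrated with a small sketch. The abstract does not specify how the DFA is constructed, so the code below substitutes a standard Aho-Corasick automaton, which matches many tags in a single pass over each peptide. The function names (`build_automaton`, `tags_in_peptide`, `filter_candidates`), the `min_hits` parameter, and the exact-substring matching rule are all assumptions made for illustration, not the authors' implementation.

```python
from collections import deque


def build_automaton(tags):
    """Build an Aho-Corasick automaton over the sequence tags (illustrative stand-in)."""
    goto = [{}]    # goto[state][residue] -> next state
    fail = [0]     # failure link of each state
    out = [set()]  # tags recognised on reaching each state
    for tag in tags:
        state = 0
        for ch in tag:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(tag)
    # Breadth-first pass: compute failure links and merge outputs along them.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def tags_in_peptide(peptide, automaton):
    """Return the set of tags that occur as substrings of the peptide."""
    goto, fail, out = automaton
    state, found = 0, set()
    for ch in peptide:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= out[state]
    return found


def filter_candidates(peptides, tags, min_hits=1):
    """Keep peptides containing at least min_hits distinct tags (hypothetical rule)."""
    automaton = build_automaton(tags)
    return [p for p in peptides if len(tags_in_peptide(p, automaton)) >= min_hits]


if __name__ == "__main__":
    # Toy peptide strings, not taken from the yeast database in the paper.
    database = ["ACDEFGHIK", "LMNPQRSTVW", "GHIKLMNP"]
    print(filter_candidates(database, ["DEF", "KLM"]))
```

Only peptides that survive this filter would be passed to the more expensive blind PTM search. A real implementation would likely also need to handle tag direction and residues of near-identical mass, which this sketch ignores.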