Binomial probability distribution model-based protein identification algorithm for tandem mass spectrometry utilizing peak intensity information
- PMID: 23163785
- DOI: 10.1021/pr300781t
Abstract
Mass spectrometry has become one of the most important technologies in proteomic analysis. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is a major tool for the analysis of peptide mixtures from protein samples. The key step of MS data processing is the identification of peptides from experimental spectra by searching public sequence databases. Although a number of algorithms to identify peptides from MS/MS data have already been proposed, e.g., Sequest, OMSSA, X!Tandem, and Mascot, they are mainly based on statistical models that consider only peak matches between experimental and theoretical spectra, not peak intensity information. Moreover, different algorithms give different results from the same MS data, implying their probable incompleteness and questionable reproducibility. We developed a novel peptide identification algorithm, ProVerB, based on a binomial probability distribution model of protein tandem mass spectrometry combined with a new scoring function that makes full use of peak intensity information and thus enhances identification ability. Compared with Mascot, Sequest, and SQID, ProVerB identified significantly more peptides from LC-MS/MS data sets at a 1% false discovery rate (FDR) and provided more confident peptide identifications. ProVerB is also compatible with various platforms and experimental data sets, showing its robustness and versatility. The open-source program ProVerB is available at http://bioinformatics.jnu.edu.cn/software/proverb/.
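To make the idea of a binomial match model concrete, the following is a minimal sketch of a generic peak-match score. It is not ProVerB's scoring function, which the abstract does not specify. It assumes each of the n theoretical fragment peaks matches an observed peak by chance with a fixed probability, and it reports the upper-tail probability of seeing at least k matches. The function names, the tolerance `tol`, and the random-match probability `p_random` are all hypothetical choices for illustration. The sketch counts matches only and ignores peak intensity, which is the kind of model the abstract contrasts ProVerB against.

```python
import math


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def count_matches(theoretical_mz, observed_mz, tol):
    """Count theoretical peaks that have an observed peak within +/- tol."""
    return sum(
        any(abs(t - o) <= tol for o in observed_mz) for t in theoretical_mz
    )


def match_score(theoretical_mz, observed_mz, tol, p_random):
    """Hypothetical score: -10 * log10 of the chance of >= k random matches.

    p_random is an assumed per-peak random-match probability (0 < p < 1).
    Intensity is deliberately ignored in this sketch.
    """
    n = len(theoretical_mz)
    k = count_matches(theoretical_mz, observed_mz, tol)
    return -10 * math.log10(binomial_tail(n, k, p_random))


if __name__ == "__main__":
    theo = [100.0, 200.0, 300.0, 400.0]   # made-up fragment m/z values
    obs = [100.01, 250.0, 299.99, 400.0]  # made-up observed peaks
    print(count_matches(theo, obs, tol=0.02))
    print(match_score(theo, obs, tol=0.02, p_random=0.05))
```

A higher score means the observed number of matches is less likely under random matching. An intensity-aware method would additionally weight or model the matched peaks' intensities, and the abstract does not say how ProVerB does this.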