RAId_DbS: peptide identification using database searches with realistic statistics
- PMID: 17961253
- PMCID: PMC2211744
- DOI: 10.1186/1745-6150-2-25
Abstract
Background: The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides.
Results: Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the collected score statistics. The t-tests may be used to measure the reliability of the reported statistics. When combined with the P-value reported for a peptide hit under a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with those of several other search tools. The executables and data related to RAId_DbS are freely available upon request.
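As a rough illustration of how a P-value for a single candidate relates to an E-value for a whole database search, the sketch below uses the standard relations E = N * P and P_db = 1 - exp(-E), assuming the N candidate scores are independent. It is not the RAId_DbS implementation; the function names `e_value` and `database_p_value` and the numbers in the usage lines are hypothetical.

```python
import math


def e_value(p_single: float, n_candidates: int) -> float:
    """Expected number of random candidates scoring at least as well.

    p_single     -- P-value of the score for one random candidate peptide
    n_candidates -- number of candidate peptides scored for this spectrum
    (Both parameter names are illustrative assumptions.)
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    if n_candidates < 0:
        raise ValueError("n_candidates must be non-negative")
    return n_candidates * p_single


def database_p_value(e: float) -> float:
    """Probability of at least one random hit at this score or better,
    under a Poisson approximation: 1 - exp(-E)."""
    if e < 0.0:
        raise ValueError("e must be non-negative")
    # expm1 keeps precision when E is very small
    return -math.expm1(-e)


if __name__ == "__main__":
    # Hypothetical numbers: a hit with single-candidate P-value 1e-6
    # found among 50,000 candidate peptides.
    e = e_value(1e-6, 50_000)
    print(f"E-value: {e:.3g}")
    print(f"Database P-value: {database_p_value(e):.3g}")
```

For small E the two quantities nearly coincide, which is why an E-value well below 1 is commonly read as the chance that the hit arose at random.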
Similar articles
- Robust accurate identification of peptides (RAId): deciphering MS2 data using a structured library search with de novo based statistics. Bioinformatics. 2005 Oct 1;21(19):3726-32. doi: 10.1093/bioinformatics/bti620. Epub 2005 Aug 16. PMID: 16105903.
- Calibrating E-values for MS2 database search methods. Biol Direct. 2007 Nov 5;2:26. doi: 10.1186/1745-6150-2-26. PMID: 17983478. Free PMC article.
- RAId_aPS: MS/MS analysis with multiple scoring functions and spectrum-specific statistics. PLoS One. 2010 Nov 16;5(11):e15438. doi: 10.1371/journal.pone.0015438. PMID: 21103371. Free PMC article.
- Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol. 2007;367:87-119. doi: 10.1385/1-59745-275-0:87. PMID: 17185772. Review.
- Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics. 2011 Oct;8(5):645-57. doi: 10.1586/epr.11.54. PMID: 21999834. Review.
Cited by
- A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. J Proteomics. 2010 Oct 10;73(11):2092-123. doi: 10.1016/j.jprot.2010.08.009. Epub 2010 Sep 8. PMID: 20816881. Free PMC article. Review.
- Assigning spectrum-specific P-values to protein identifications by mass spectrometry. Bioinformatics. 2011 Apr 15;27(8):1128-34. doi: 10.1093/bioinformatics/btr089. Epub 2011 Feb 23. PMID: 21349864. Free PMC article.
- A graphical user interface for RAId, a knowledge integrated proteomics analysis suite with accurate statistics. BMC Res Notes. 2018 Mar 15;11(1):182. doi: 10.1186/s13104-018-3289-6. PMID: 29544540. Free PMC article.
- SweetSEQer, simple de novo filtering and annotation of glycoconjugate mass spectra. Mol Cell Proteomics. 2013 Jun;12(6):1735-40. doi: 10.1074/mcp.O112.025940. Epub 2013 Feb 26. PMID: 23443135. Free PMC article.
- Detection of co-eluted peptides using database search methods. Biol Direct. 2008 Jul 2;3:27. doi: 10.1186/1745-6150-3-27. PMID: 18597684. Free PMC article.