False discovery rates of protein identifications: a strike against the two-peptide rule

Nitin Gupta¹, Pavel A Pevzner

PMID: 19627159
PMCID: PMC3398614
DOI: 10.1021/pr9004794

J Proteome Res. 2009 Sep;8(9):4173-81. doi: 10.1021/pr9004794.

Affiliation

¹ Bioinformatics Program and Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, USA. ngupta@ucsd.edu

Abstract

Most proteomics studies attempt to maximize the number of peptide identifications and subsequently accept only proteins containing two or more identified peptides as reliable protein identifications. In this study, we evaluate the effect of this "two-peptide" rule on protein identifications, using multiple search tools and data sets. Contrary to intuition, the "two-peptide" rule reduces the number of protein identifications in the target database more significantly than in the decoy database and results in increased false discovery rates, compared to the case when single-hit proteins are not discarded. We therefore recommend that the "two-peptide" rule be abandoned and that protein identifications instead be subject to the estimation of error rates, as is the case with peptide identifications. We further extend the generating function approach (originally proposed for evaluating matches between a peptide and a single spectrum) to evaluating matches between a protein and an entire spectral data set.
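The comparison described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `protein_fdr`, the input layout, and the FDR estimate (decoy proteins divided by target proteins) are assumptions, and `toy_hits` is made-up input chosen only to exercise the code.

```python
from collections import defaultdict


def protein_fdr(peptide_hits, min_peptides=1):
    """Count target and decoy proteins and estimate a protein-level FDR.

    peptide_hits: iterable of (protein_id, peptide, is_decoy) tuples
                  (hypothetical layout, assumed for this sketch).
    min_peptides: 1 keeps single-hit proteins; 2 applies the "two-peptide" rule.
    Returns (n_target_proteins, n_decoy_proteins, estimated_fdr).
    """
    peptides_by_protein = defaultdict(set)
    is_decoy_protein = {}
    for protein_id, peptide, is_decoy in peptide_hits:
        peptides_by_protein[protein_id].add(peptide)
        is_decoy_protein[protein_id] = is_decoy

    # Keep proteins supported by at least min_peptides distinct peptides.
    kept = [p for p, peps in peptides_by_protein.items()
            if len(peps) >= min_peptides]
    n_target = sum(1 for p in kept if not is_decoy_protein[p])
    n_decoy = sum(1 for p in kept if is_decoy_protein[p])

    # Assumed estimator: decoy count / target count (0.0 if no targets remain).
    fdr = n_decoy / n_target if n_target else 0.0
    return n_target, n_decoy, fdr


# Made-up toy input; it says nothing about real data sets.
toy_hits = [
    ("P1", "AAAK", False), ("P1", "BBBR", False),
    ("P2", "CCCK", False),
    ("P3", "DDDR", False),
    ("P4", "EEEK", False),
    ("D1", "FFFK", True), ("D1", "GGGR", True),
    ("D2", "HHHK", True),
]

all_proteins = protein_fdr(toy_hits, min_peptides=1)
two_peptide = protein_fdr(toy_hits, min_peptides=2)
```

Running both settings on the same peptide hits gives the two quantities the abstract compares: how many target and decoy proteins survive with and without the filter. The outcome depends entirely on the input, so the toy values here only show the mechanics.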

Figures

**Figure 1**
Identification of peptides in the *Shewanella* data set using different approaches and scoring functions. Each point in the curves is generated by varying the scoring threshold and computing the number of hits in the target and the decoy database exceeding the threshold.

**Figure 2**
(a) Identification of proteins in the human data set using different approaches and scoring functions. (b) Similar plot as in panel (a) for the *Shewanella* data set.

**Figure 3**
(a) Protein identification in the human data set using X!Tandem search results with different scoring approaches at the protein level. (b) Similar plot as in panel (a) for an arbitrarily selected subset of the *Shewanella* data set containing 1.25 million spectra.

**Figure 4**
Identification of proteins, using the unique peptides only (peptides that are not shared between multiple proteins), in the human data set using InsPecT search results with different approaches.

**Figure 5**
(a) Identification of proteins in the human data set using MS-GF scores, without and with length correction. (b) Similar plot as in panel (a) for the *Shewanella* data set.
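The curve construction described in the Figure 1 caption (vary the scoring threshold, count target and decoy hits above it) might look roughly like the following. This is a hedged sketch rather than the authors' code: the function name `target_decoy_curve` is hypothetical, higher scores are assumed to be better, a hit is counted when its score is greater than or equal to the threshold, and the scores are invented.

```python
def target_decoy_curve(target_scores, decoy_scores):
    """Return one (threshold, n_target, n_decoy) point per distinct score.

    Assumes higher scores are better and counts a hit when
    score >= threshold (an assumption made for this sketch).
    """
    # Use every observed score as a candidate threshold, strictest first.
    thresholds = sorted(set(target_scores) | set(decoy_scores), reverse=True)
    curve = []
    for threshold in thresholds:
        n_target = sum(1 for s in target_scores if s >= threshold)
        n_decoy = sum(1 for s in decoy_scores if s >= threshold)
        curve.append((threshold, n_target, n_decoy))
    return curve


# Invented scores, used only to show the shape of the output.
toy_curve = target_decoy_curve([9, 7, 5, 3], [6, 2])
```

Each tuple corresponds to one point on such a curve; plotting the target count against the decoy count (or against their ratio) gives the kind of trade-off view the captions describe.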

Cited by

Cannabidiol and Tetrahydrocannabinol Antinociceptive Activity is Mediated by Distinct Receptors in Caenorhabditis elegans.
Boujenoui F, Nkambeu B, Salem JB, Castano Uruena JD, Beaudry F. Neurochem Res. 2024 Apr;49(4):935-948. doi: 10.1007/s11064-023-04069-6. Epub 2023 Dec 23. PMID: 38141130
Identification of cross-reactive IgE-binding proteins from Philippine allergenic grass pollen extracts.
Castor MAR, Cruz MKDM, Balanag GAM, Hate KM, Reyes RDC, Agcaoili-De Jesus MS, Ocampo-Cervantes CC, Dalmacio LMM. Asia Pac Allergy. 2024 Aug;14(3):108-117. doi: 10.5415/apallergy.0000000000000155. Epub 2024 Aug 5. PMID: 39220572
Improved Differential Diagnosis of Alzheimer's Disease by Integrating ELISA and Mass Spectrometry-Based Cerebrospinal Fluid Biomarkers.
Khoonsari PE, Shevchenko G, Herman S, Remnestål J, Giedraitis V, Brundin R, Degerman Gunnarsson M, Kilander L, Zetterberg H, Nilsson P, Lannfelt L, Ingelsson M, Kultima K. J Alzheimers Dis. 2019;67(2):639-651. doi: 10.3233/JAD-180855. PMID: 30614806
Cannflavins isolated from Cannabis sativa impede Caenorhabditis elegans response to noxious heat.
Lahaise M, Boujenoui F, Beaudry F. Naunyn Schmiedebergs Arch Pharmacol. 2024 Jan;397(1):535-548. doi: 10.1007/s00210-023-02621-3. Epub 2023 Jul 22. PMID: 37480489
Anandamide Modulates Thermal Avoidance in Caenorhabditis elegans Through Vanilloid and Cannabinoid Receptor Interplay.
Abdollahi M, Castaño JD, Salem JB, Beaudry F. Neurochem Res. 2024 Sep;49(9):2423-2439. doi: 10.1007/s11064-024-04186-w. Epub 2024 Jun 7. PMID: 38847909
