A novel algorithm for validating peptide identification from a shotgun proteomics search engine

Ling Jian¹, Xinnan Niu, Zhonghang Xia, Parimal Samir, Chiranthani Sumanasekera, Zheng Mu, Jennifer L Jennings, Kristen L Hoek, Tara Allos, Leigh M Howard, Kathryn M Edwards, P Anthony Weil, Andrew J Link

PMID: 23402659
PMCID: PMC3608465
DOI: 10.1021/pr300631t

Ling Jian et al. J Proteome Res. 2013 Mar 1;12(3):1108-19.

doi: 10.1021/pr300631t. Epub 2013 Feb 12.

Affiliation

¹ Department of Pathology, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, United States.

Abstract

Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has revolutionized the proteomics analysis of complexes, cells, and tissues. In a typical proteomic analysis, each tandem mass spectrum from an LC-MS/MS experiment is assigned to a peptide by a search engine that compares the experimental MS/MS peptide data to theoretical peptide sequences in a protein database. The peptide-spectrum matches (PSMs) are then used to infer a list of identified proteins in the original sample. However, the search engines often fail to distinguish between correct and incorrect peptide assignments. In this study, we designed and implemented a novel algorithm called De-Noise to reduce the number of incorrect peptide matches and maximize the number of correct peptides at a fixed false discovery rate (FDR) using a minimal number of scoring outputs from the SEQUEST search engine. The novel algorithm uses a three-step process: data cleaning, data refining through an SVM-based decision function, and a final data refining step based on proteolytic peptide patterns. Using proteomics data generated on different types of mass spectrometers, we optimized the De-Noise algorithm on the basis of the resolution and mass accuracy of the mass spectrometer employed in the LC-MS/MS experiment. Our results demonstrate that De-Noise improves peptide identification compared to other methods used to process the PSMs assigned by SEQUEST. Because De-Noise uses a limited number of scoring attributes, it can be easily implemented with other search engines.
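To make the idea of accepting matches "at a fixed false discovery rate" concrete, the following is a minimal sketch of a generic target-decoy cutoff, not the De-Noise algorithm itself. The abstract does not state which FDR estimator was used, so the estimate here (decoy count divided by target count above the cutoff) is an assumption, as are the function name `fdr_threshold`, the `(score, is_decoy)` tuple layout, and the toy scores. Scores are assumed to be distinct and higher-is-better.

```python
from typing import List, Tuple

def fdr_threshold(psms: List[Tuple[float, bool]], max_fdr: float) -> Tuple[float, int]:
    """Return (score cutoff, number of accepted target PSMs).

    psms: (score, is_decoy) pairs; higher score means a better match.
    The FDR at a cutoff is estimated as decoys / targets among PSMs
    scoring at or above it (an assumed estimator, see lead-in).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best_cutoff, best_targets = float("inf"), 0
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the lowest cutoff whose estimated FDR is still acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff, best_targets = score, targets
    return best_cutoff, best_targets

# Hypothetical toy data: True marks a decoy-database hit.
example_psms = [
    (5.0, False), (4.5, False), (4.0, False), (3.5, True),
    (3.0, False), (2.5, False), (2.0, True), (1.5, False),
]

cutoff, n_targets = fdr_threshold(example_psms, max_fdr=0.05)
print(cutoff, n_targets)
```

A validation method that separates correct from incorrect matches more cleanly pushes decoys further down the ranking, so more target PSMs survive at the same FDR. Target counts at a fixed FDR are the quantity on which the abstract's comparison rests.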

Figures

**Figure 1**
Pseudocode for the De-Noise algorithm.

**Figure 2**
Venn diagrams for the seven datasets showing the number of overlapping validated PSMs from De-Noise, PeptideProphet, and Percolator. An FDR of 0.05 was used for all three approaches.

**Figure 3**
Plots of target PSM hits for the seven datasets validated under a series of FDRs for De-Noise, PeptideProphet, and Percolator. The number of target peptide hits is plotted for an FDR range from 0.01 to 0.1. (A) Gcn4 LCQ (B) UPS1 LTQ (C) Tal08 LTQ-Orbitrap XL MiPS (D) PBMC LTQ-Orbitrap XL MiPS (E) PBMC LTQ-Orbitrap XL MiPS-off (F) PBMC LTQ-Orbitrap Velos MiPS (G) PBMC LTQ-Orbitrap Velos MiPS-off.

**Figure 4**
ROC curves for the seven datasets showing the validation performance of De-Noise, PeptideProphet, and Percolator. (A) Gcn4 LCQ (B) UPS1 LTQ (C) Tal08 LTQ-Orbitrap XL MiPS-off (D) PBMC LTQ-Orbitrap XL MiPS (E) PBMC LTQ-Orbitrap XL MiPS-off (F) PBMC LTQ-Orbitrap Velos MiPS (G) PBMC LTQ-Orbitrap Velos MiPS-off.
