A face in the crowd: recognizing peptides through database search

Jimmy K Eng¹, Brian C Searle, Karl R Clauser, David L Tabb

PMID: 21876205
PMCID: PMC3226415
DOI: 10.1074/mcp.R111.009522

Review

Jimmy K Eng et al. Mol Cell Proteomics. 2011 Nov;10(11):R111.009522. doi: 10.1074/mcp.R111.009522. Epub 2011 Aug 29.

Affiliation

¹ University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA. engj@u.washington.edu

Abstract

Peptide identification via tandem mass spectrometry sequence database searching is a key method in the array of tools available to the proteomics researcher. The ability to rapidly and sensitively acquire tandem mass spectrometry data and perform peptide and protein identifications has become a commonly used proteomics analysis technique because of advances in both instrumentation and software. Although many different tandem mass spectrometry database search tools are currently available from both academic and commercial sources, these algorithms share similar core elements while maintaining distinctive features. This review revisits the mechanism of sequence database searching and discusses how various parameter settings impact the underlying search.

Figures

**Fig. 1.**
**A snapshot of MS/MS data acquisition.** A schematic showing the relationship between MS scans (*blue*) and MS/MS scans (*red*) in a typical tandem MS experiment. In this example, a precursor ion in each MS scan is isolated and fragmented, resulting in the MS/MS spectra that alternate every scan with the MS spectra. Typical data acquisition would acquire multiple MS/MS scans for each MS scan. In addition to the fragmentation pattern from each MS/MS spectrum used in a database search, the precursor ion's m/z, charge state, and mass accuracy of the measured precursor m/z are obtained from the MS spectrum.
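
The precursor quantities named in this caption can be related with a little arithmetic. The following is a minimal sketch, not taken from the review: it assumes positive-mode protonated ions, and the function names `neutral_mass` and `ppm_error` are hypothetical, chosen only for illustration.

```python
# Mass of a proton in Da (assumed charge carrier for positive-mode ions).
PROTON = 1.007276


def neutral_mass(mz: float, charge: int) -> float:
    """Convert an observed precursor m/z and charge state to a neutral mass."""
    return (mz - PROTON) * charge


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    # A doubly charged precursor observed at m/z 401.2
    print(neutral_mass(401.2, 2))
    # Difference between an observed and a theoretical mass, in ppm
    print(ppm_error(1000.01, 1000.0))
```

The neutral mass is what a search engine would compare against candidate peptide masses, and the ppm error indicates how tight a precursor mass tolerance the data can support.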

**Fig. 2.**
**Data acquisition and database search.** For a given experimental MS/MS spectrum, protein sequences from a database are *in silico* digested and peptides of the right mass are selected. Theoretical fragment ions from each candidate peptide are calculated and used to generate a similarity or probability score by comparing the theoretical fragment ion masses against the experimental spectrum. Each candidate peptide is scored against the experimental spectrum and the best matching peptides and their scores are reported.
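
The workflow in this caption can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the algorithm of any particular search engine: it assumes tryptic cleavage with no missed cleavages, unmodified peptides, singly charged b and y ions, and a simple shared-peak count as the score. All function names and tolerance defaults are hypothetical.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276


def tryptic_peptides(protein, min_len=6):
    """In silico digest: cleave after K or R unless the next residue is P."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if len(p) >= min_len]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fragment_mzs(peptide):
    """Theoretical m/z values of singly charged b and y ions."""
    mzs = []
    for i in range(1, len(peptide)):
        b = sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON
        mzs.extend([b, y])
    return mzs


def shared_peak_count(theoretical, observed, tol=0.5):
    """Number of theoretical ions with an observed peak within tol (m/z)."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


def search(observed_mzs, precursor_mass, proteins,
           precursor_tol=1.0, fragment_tol=0.5):
    """Score every candidate peptide of the right mass; best match first."""
    hits = []
    for protein in proteins:
        for pep in tryptic_peptides(protein):
            if abs(peptide_mass(pep) - precursor_mass) <= precursor_tol:
                score = shared_peak_count(
                    fragment_mzs(pep), observed_mzs, fragment_tol)
                hits.append((score, pep))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    database = ["MAGSTVKELVISKSAMPLER"]
    # Simulate a noise-free spectrum from a known peptide.
    spectrum = fragment_mzs("SAMPLER")
    print(search(spectrum, peptide_mass("SAMPLER"), database))
```

Real search tools replace the shared-peak count with correlation or probability-based scores and also handle missed cleavages, modifications, and multiple fragment charge states; the parameter names here (`precursor_tol`, `fragment_tol`) merely stand in for the kinds of settings the review discusses.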

Cited by

Quantitative affinity purification mass spectrometry: a versatile technology to study protein-protein interactions.
Meyer K, Selbach M. Front Genet. 2015 Jul 14;6:237. doi: 10.3389/fgene.2015.00237. eCollection 2015. PMID: 26236332. Review.

A chemical proteomics approach to profiling the ATP-binding proteome of Mycobacterium tuberculosis.
Wolfe LM, Veeraraghavan U, Idicula-Thomas S, Schürer S, Wennerberg K, Reynolds R, Besra GS, Dobos KM. Mol Cell Proteomics. 2013 Jun;12(6):1644-60. doi: 10.1074/mcp.M112.025635. Epub 2013 Mar 5. PMID: 23462205.

2018 YPIC Challenge: A Case Study in Characterizing an Unknown Protein Sample.
Pino L, Lin A, Bittremieux W. J Proteome Res. 2019 Nov 1;18(11):3936-3943. doi: 10.1021/acs.jproteome.9b00384. Epub 2019 Oct 7. PMID: 31556620.

Multi-omics Visualization Platform: An extensible Galaxy plug-in for multi-omics data visualization and exploration.
McGowan T, Johnson JE, Kumar P, Sajulga R, Mehta S, Jagtap PD, Griffin TJ. Gigascience. 2020 Apr 1;9(4):giaa025. doi: 10.1093/gigascience/giaa025. PMID: 32236523.

HyperSpec: Ultrafast Mass Spectra Clustering in Hyperdimensional Space.
Xu W, Kang J, Bittremieux W, Moshiri N, Rosing T. J Proteome Res. 2023 Jun 2;22(6):1639-1648. doi: 10.1021/acs.jproteome.2c00612. Epub 2023 May 11. PMID: 37166120.
