Spectral dictionaries: Integrating de novo peptide sequencing with database search of tandem mass spectra
- PMID: 18703573
- PMCID: PMC2621003
- DOI: 10.1074/mcp.M800103-MCP200
Abstract
Database search tools identify peptides by matching tandem mass spectra against a protein database. We study an alternative approach in which all plausible de novo interpretations of a spectrum (its spectral dictionary) are generated and then quickly matched against the database. We present a new algorithm, MS-Dictionary, for efficiently generating spectral dictionaries and demonstrate that it can identify spectra that database search misses. We argue that MS-Dictionary enables proteogenomics searches against six-frame translations of genomic sequences, which may be prohibitively time-consuming for existing database search approaches. We show that such searches allow one to correct sequencing errors and find programmed frameshifts.
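The idea of generating a spectral dictionary and then looking its entries up in a database can be illustrated with a toy sketch. This is not the MS-Dictionary algorithm, which the abstract does not describe; it is a brute-force enumeration under simplifying assumptions: a five-letter alphabet with nominal integer residue masses, a spectrum reduced to a set of integer prefix masses, and a score equal to the number of prefix masses that coincide with a peak. The function names, the scoring rule, and the example proteins are all hypothetical.

```python
from typing import Dict, List, Set

# Nominal (integer) residue masses for a small illustrative alphabet (assumption).
MASSES: Dict[str, int] = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99}


def spectral_dictionary(peaks: Set[int], parent_mass: int, min_score: int) -> List[str]:
    """Return every peptide whose residues sum to parent_mass and whose
    proper prefix masses coincide with at least min_score peaks."""
    results: List[str] = []

    def extend(prefix: str, mass: int, score: int) -> None:
        if mass == parent_mass:
            if score >= min_score:
                results.append(prefix)
            return
        for aa, m in MASSES.items():
            new_mass = mass + m
            if new_mass > parent_mass:
                continue  # prune: this branch already overshoots the parent mass
            # Only proper prefixes count; the full peptide mass is not a peak hit.
            hit = 1 if (new_mass < parent_mass and new_mass in peaks) else 0
            extend(prefix + aa, new_mass, score + hit)

    extend("", 0, 0)
    return sorted(results)


def match_dictionary(dictionary: List[str], proteins: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each dictionary peptide to the proteins that contain it as a substring."""
    hits: Dict[str, List[str]] = {}
    for pep in dictionary:
        found = [name for name, seq in proteins.items() if pep in seq]
        if found:
            hits[pep] = found
    return hits


if __name__ == "__main__":
    # Hypothetical spectrum: prefix masses of G, GA, GAS; parent mass of GASP.
    peaks = {57, 128, 215}
    dictionary = spectral_dictionary(peaks, parent_mass=312, min_score=2)
    print(dictionary)
    # Hypothetical two-protein database.
    proteins = {"protA": "MKGAPSLL", "protB": "TTGASPK"}
    print(match_dictionary(dictionary, proteins))
```

Lowering `min_score` enlarges the dictionary, admitting interpretations that miss a peak, at the cost of more candidates to look up. A practical implementation would need real-valued masses with a tolerance, a probabilistic score, and an indexed database rather than a substring scan.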