Reducing the haystack to find the needle: improved protein identification after fast elimination of non-interpretable peptide MS/MS spectra and noise reduction
- PMID: 20158870
- PMCID: PMC2822527
- DOI: 10.1186/1471-2164-11-S1-S13
Abstract
Background: Tandem mass spectrometry (MS/MS) has become a standard method for identifying proteins extracted from biological samples, but the huge number of MS/MS spectra and their noise contamination obstruct swift and reliable computer-aided interpretation. Typically, only a minor fraction of the spectra per sample (most often only a few percent) and about 10% of the peaks per spectrum contribute to the final result, provided that noise does not prevent protein identification altogether.
Results: Two fast preprocessing screens can substantially reduce the haystack of MS/MS data. (1) Simple sequence ladder rules remove spectra that cannot be interpreted as peptide sequences. (2) Modified Fourier-transform-based criteria clear background noise from the remaining data. On average, only 35% of the MS/MS spectra (each reduced in size by about one quarter) have to be handed over to the interpretation software for reliable protein identification, essentially without loss of information, with a trend toward improved sequence coverage and with a proportional decrease in computer resource consumption.
Conclusions: Searching for sequence ladders in tandem mass spectra, followed by noise suppression, is a promising strategy to reduce the number of MS/MS spectra from electrospray instruments and to enhance the reliability of protein matches. Supplementary material and the software are available from the accompanying website at http://mendel.bii.a-star.edu.sg/mass-spectrometry/MSCleaner-2.0/.
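To illustrate the idea behind the first screen, the following is a minimal sketch of a sequence ladder check. It is not the rule set used by the MSCleaner software, which the abstract does not specify. It assumes singly charged fragment ions, uses standard monoisotopic amino acid residue masses, and the names `longest_ladder` and `looks_interpretable`, the tolerance `tol`, and the threshold `min_steps` are hypothetical choices made for this example.

```python
# Monoisotopic residue masses in Da (L and I are isobaric and share one entry).
RESIDUE_MASSES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def longest_ladder(mz_values, tol=0.02):
    """Return the number of residue-sized steps in the longest peak ladder.

    A ladder is a chain of peaks in which each consecutive pair is separated
    by the mass of one amino acid residue, within +/- tol (assumed value).
    """
    peaks = sorted(mz_values)
    # best[i] = steps in the longest ladder that ends at peak i
    best = [0] * len(peaks)
    for i, upper in enumerate(peaks):
        for j in range(i):
            gap = upper - peaks[j]
            if any(abs(gap - m) <= tol for m in RESIDUE_MASSES.values()):
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)


def looks_interpretable(mz_values, min_steps=3, tol=0.02):
    """Hypothetical screen: keep a spectrum only if it has a long enough ladder."""
    return longest_ladder(mz_values, tol) >= min_steps


# Four peaks spaced by G, A and S residue masses form a three-step ladder.
ladder_peaks = [175.119, 232.14046, 303.17757, 390.2096]
# Peaks with arbitrary spacing form no ladder at all.
noise_peaks = [100.0, 150.3, 222.7, 301.1]

print(looks_interpretable(ladder_peaks))
print(looks_interpretable(noise_peaks))
```

A real screen would also need to handle multiply charged fragments, peak intensities, and instrument-specific mass accuracy, none of which are modelled here.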