BMC Bioinformatics. 2009 Oct 15;10 Suppl 12(Suppl 12):S11.
doi: 10.1186/1471-2105-10-S12-S11.

A Perl procedure for protein identification by Peptide Mass Fingerprinting

Alessandra Tiengo et al. BMC Bioinformatics. 2009.

Abstract

Background: Protein identification is one of the topics of major interest in proteomics. It can be achieved by analyzing the mass spectrum of a protein sample through different approaches. One of them, Peptide Mass Fingerprinting (PMF), combines mass spectrometry (MS) data with a search of a suitable database of known proteins to produce a list of candidate proteins ranked by a score. Several algorithms and software tools have been proposed for this purpose. However, the scoring methods and, above all, the statistical evaluation of the results can be significantly improved.

Results: This work presents MsPI (Mass spectrometry Protein Identification), a Perl procedure for protein identification by PMF. The implemented scoring methods were derived from the literature. MsPI implements a strategy to remove the contaminant masses present in the acquired spectra. Moreover, MsPI includes a statistical method that assigns a p-value to each candidate protein in addition to its score. Results obtained by MsPI on a dataset of 10 protein samples were compared with those achieved using two other software tools, Piums and Mascot. Piums implements one of the scoring methods available in MsPI, while Mascot is one of the most frequently used software tools in the protein identification field. MsPI scripts are available for download at http://aimed11.unipv.it/MsPI.

Conclusion: On the considered dataset, MsPI appears to perform better than Piums and Mascot. MsPI includes the "true" proteins in its candidate list nine times out of ten, whereas Piums does so only four times out of ten. Mascot also includes the "true" proteins in its candidate list nine times out of ten, but it produces longer candidate lists and therefore more false positives when the molecular weight of the proteins in the sample is approximately known (e.g. from the 1-D/2-D electrophoresis gel). Moreover, since MsPI is a Perl tool, it can be easily extended and customized by end users.


Figures

Figure 1
PMF consists of three steps. (1) Preparation of the biological sample: a band or spot of the electrophoretic gel is selected and digested with a suitable protease, such as trypsin. The resulting mixture of peptides is analyzed with a mass spectrometer, usually in MALDI-TOF configuration. (2) A reference protein database is created by reproducing step 1 in silico on a set of known proteins, also considering possible missed cleavages and post-translational modifications. (3) The acquired spectrum is matched against the theoretical spectra generated by in silico digestion of all proteins in the reference database (step 2), and a ranked list of candidate proteins is obtained.
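Steps 2 and 3 of this workflow can be illustrated with a minimal Python sketch. It is not MsPI's Perl code and does not reproduce any of its scoring methods: it assumes trypsin cleaves after K or R unless the next residue is P, ignores post-translational modifications, and ranks candidates by a plain count of matched masses. The function names (`cleave`, `digest`, `mh_mass`, `count_matches`, `rank`), the 0.2 Da tolerance, and the toy sequences are all hypothetical choices made for illustration.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # singly protonated ion, as typically observed in MALDI


def cleave(sequence):
    """Split a protein after every K or R that is not followed by P."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1:i + 2]  # empty string at the C-terminus
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def digest(sequence, max_missed=1):
    """In silico digestion allowing up to max_missed missed cleavages."""
    fragments = cleave(sequence)
    peptides = set()
    for i in range(len(fragments)):
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.add("".join(fragments[i:j + 1]))
    return peptides


def mh_mass(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[a] for a in peptide) + WATER + PROTON


def count_matches(observed, sequence, tol=0.2, max_missed=1):
    """Number of observed masses within tol Da of a theoretical peptide."""
    theoretical = [mh_mass(p) for p in digest(sequence, max_missed)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def rank(observed, database, tol=0.2):
    """Candidate proteins sorted by decreasing match count."""
    scores = {name: count_matches(observed, seq, tol)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy example: two made-up sequences and a made-up peak list.
database = {"P1": "AAKGGRPLKVVR", "P2": "MMMMKWWWWR"}
observed = [mh_mass("AAK"), mh_mass("VVR")]
ranking = rank(observed, database)
```

A match count is the simplest possible score and says nothing about statistical significance. Tools such as those compared in the abstract replace it with more refined scoring schemes and, in the case of MsPI, with a p-value for each candidate.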
Figure 2
The directory structure created during the installation of the MsPI tool.


