Comparative Study
Bioinformatics. 2010 Sep 1;26(17):2153-9.
doi: 10.1093/bioinformatics/btq341. Epub 2010 Jul 22.

Prophossi: automating expert validation of phosphopeptide-spectrum matches from tandem mass spectrometry

David M A Martin et al. Bioinformatics.

Abstract

Motivation: Complex patterns of protein phosphorylation mediate many cellular processes. Tandem mass spectrometry (MS/MS) is a powerful tool for identifying these post-translational modifications. In high-throughput experiments, mass spectrometry database search engines such as MASCOT provide a ranked list of peptide identifications based on the hundreds of thousands of MS/MS spectra obtained in a mass spectrometry experiment. These search results are not in themselves sufficient for confident assignment of phosphorylation sites, because identifying the characteristic mass differences requires time-consuming manual assessment of the spectra by an experienced analyst. The time required for manual assessment has previously made confident high-throughput assignment of phosphorylation sites challenging.

Results: We have developed a knowledge base of criteria that replicate expert assessment, allowing more than half of cases to be automatically validated and site assignments verified with a high degree of confidence. This was assessed by comparing automated spectral interpretation with careful manual examination of the assignments for 501 peptides above the 1% false discovery rate (FDR) threshold corresponding to 259 putative phosphorylation sites in 74 proteins of the Trypanosoma brucei proteome. Despite this stringent approach, we are able to validate 80 of the 91 phosphorylation sites (88%) positively identified by manual examination of the spectra used for the MASCOT searches with an FDR < 15%.
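The quoted recovery rate can be checked with a line of arithmetic. The sketch below only restates the counts given in the Results paragraph (80 of 91 manually confirmed sites); the variable names are illustrative and do not come from the paper.

```python
# Counts taken from the Results paragraph; the names are illustrative only.
manually_confirmed_sites = 91
validated_by_software = 80

# Fraction of manually confirmed sites that the automated rules also validated.
recovered_fraction = validated_by_software / manually_confirmed_sites
print(f"{recovered_fraction:.1%} of manually confirmed sites recovered")
```

The fraction rounds to the 88% reported in the abstract.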

Conclusions: High-throughput computational analysis can provide a viable second-stage validation of primary mass spectrometry database search results. Such validation gives rapid access to a systems-level overview of protein phosphorylation in the experiment under investigation.

Availability: A GPL-licensed software implementation in Perl for analysis and spectrum annotation is available in the supplementary material, and a web server can be accessed online at http://www.compbio.dundee.ac.uk/prophossi.

Figures

Fig. 1.
An example of misidentification of the correct phosphorylation site. MASCOT identifies the phosphorylation site as pS16 (peptide score 118), though the expected y5-98 ion is much weaker than the weak y5 ion (blue). The second ranked hit (pY17, score 102) is preferred by the experienced analyst with a strong y5 ion match (red) giving a continuous y-ion ladder. The spectrum was annotated with Prophossi and modified. The threshold for ion inclusion is indicated by a blue bar on the y-axis.
Fig. 2.
Workflow for automated annotation of phosphosites. Experimental LC–MS/MS data is gathered (1) and processed using platform-specific software (2) to give a generic peak list file (3). This file is used as the input to MASCOT (4), which generates a results file (5) containing all the PSMs. This file is parsed into the MLRV relational database (6) and the FDR for the search is determined (7). A PSM-quality prefilter is applied (8) and suitable PSMs are exported to the TryPP-DB (9), where they are linked to the source peak list file used for the search. The observed MS/MS spectrum is extracted from the peak list file (10), filtered by an intensity threshold (11) and compared with a calculated fragmentation spectrum (12) for the peptide under examination. Observed ions are assigned to series (13), allowing the curation rules to be applied (14).
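Steps 11 to 13 of the workflow in Fig. 2 can be illustrated with a minimal Python sketch. It is not the published Perl implementation: the function names, the 5% relative intensity threshold and the 0.5 m/z matching tolerance are all assumptions chosen for illustration, and only singly charged b and y ions are calculated.

```python
from typing import Dict, List, Set, Tuple

# Monoisotopic residue masses (Da) for a subset of amino acids; extend as needed.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "D": 115.02694,
    "K": 128.09496, "E": 129.04259, "R": 156.10111, "Y": 163.06333,
}
PHOSPHO = 79.96633  # HPO3 added to a phosphorylated S, T or Y
WATER = 18.01056
PROTON = 1.00728

Peak = Tuple[float, float]  # (m/z, intensity)


def filter_by_intensity(peaks: List[Peak], rel_threshold: float = 0.05) -> List[Peak]:
    """Step 11: keep peaks at or above a fraction of the most intense peak.

    The 5% default is an assumed value, not one taken from the paper.
    """
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    return [(mz, i) for mz, i in peaks if i >= rel_threshold * base]


def singly_charged_ions(sequence: str, phospho_sites: Set[int]) -> Dict[str, float]:
    """Step 12: calculated m/z of 1+ b and y ions; sites are 1-based positions."""
    masses = [
        RESIDUE_MASS[aa] + (PHOSPHO if pos in phospho_sites else 0.0)
        for pos, aa in enumerate(sequence, start=1)
    ]
    n = len(masses)
    ions: Dict[str, float] = {}
    for k in range(1, n):
        ions[f"b{k}"] = sum(masses[:k]) + PROTON
        ions[f"y{k}"] = sum(masses[n - k:]) + WATER + PROTON
    return ions


def assign_ions(peaks: List[Peak], ions: Dict[str, float], tol: float = 0.5) -> Dict[str, Peak]:
    """Step 13: label each calculated ion with the most intense peak within tol."""
    assigned: Dict[str, Peak] = {}
    for label, mz in ions.items():
        candidates = [p for p in peaks if abs(p[0] - mz) <= tol]
        if candidates:
            assigned[label] = max(candidates, key=lambda p: p[1])
    return assigned


# Hypothetical three-residue peptide with a phosphoserine at position 2.
observed = [(147.11, 100.0), (314.10, 40.0), (200.00, 2.0)]
kept = filter_by_intensity(observed)
matches = assign_ions(kept, singly_charged_ions("ASK", {2}))
print(sorted(matches))
```

The resulting dictionary of assigned ions is the kind of input that curation rules (step 14) would then inspect; neutral-loss ions such as the y5-98 ion mentioned in Fig. 1 are left out to keep the sketch short.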
Fig. 3.
All manually curated peptide–spectrum matches containing at least one phosphorylated residue were ordered according to their MASCOT (red), SEQUEST (black) or SEQUEST + Peptide Prophet (green) score. Dotted lines indicate the performance of all matches ordered by search engine score. Solid lines indicate the performance of the subset of matches positively curated by ProPhosSI. The increased area under the curve for the solid lines indicates better performance by ProPhosSI.
Fig. 4.
A manually verified PSM that ProPhosSI fails to validate. Many ion labels are not shown for clarity. Evidence for phosphorylation at S1 arises from the b2 ion (a). ProPhosSI requires ion transitions over a phosphosite and so requires more than one ion. Evidence for phosphorylation at S4 arises from the uniquely assigned des-phospho y10 [2+] ion. ProPhosSI does not consider 2+ ions as they can in many cases be assigned to more than one fragment.
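The requirement for an ion transition over a phosphosite, described in the Fig. 4 caption, can be sketched as follows. This is one possible reading of the rule, not the ProPhosSI code: it assumes a transition means that two consecutive singly charged ions of the same series are both observed, one fragment excluding the site and the next including it. The function name, its parameters and the peptide length of 12 used in the example are hypothetical.

```python
from typing import Set


def has_site_transition(series: str, site: int, peptide_length: int,
                        observed: Set[int]) -> bool:
    """Return True if two consecutive 1+ ions of one series flank the site.

    series: "b" or "y"; site: 1-based residue position;
    observed: fragment indices matched for that series, e.g. {2, 3} for b2 and b3.
    """
    if series == "b":
        without_site, with_site = site - 1, site
    elif series == "y":
        without_site = peptide_length - site
        with_site = peptide_length - site + 1
    else:
        raise ValueError("series must be 'b' or 'y'")
    # Index 0 is not a fragment, and index == peptide_length is the intact peptide.
    if without_site < 1 or with_site > peptide_length - 1:
        return False
    return without_site in observed and with_site in observed


# A lone b2 ion cannot provide a transition over residue 1 under this reading.
print(has_site_transition("b", site=1, peptide_length=12, observed={2}))
# y8 and y9 flank residue 4 of a 12-residue peptide.
print(has_site_transition("y", site=4, peptide_length=12, observed={8, 9}))
```

Under this reading, a site at the first residue can never be confirmed by a transition, which is consistent with the caption's point that a single b2 ion is not enough for ProPhosSI even when an analyst accepts it.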
