Nat Biotechnol. 2015 Jul;33(7):743-9.
doi: 10.1038/nbt.3267. Epub 2015 Jun 15.

A mass-tolerant database search identifies a large proportion of unassigned spectra in shotgun proteomics as modified peptides


Joel M Chick et al. Nat Biotechnol. 2015 Jul.

Abstract

Fewer than half of all tandem mass spectrometry (MS/MS) spectra acquired in shotgun proteomics experiments are typically matched to a peptide with high confidence. Here we determine the identity of unassigned peptides using an ultra-tolerant Sequest database search that allows peptide matching even with modifications of unknown masses up to ±500 Da. In a proteome-wide data set on HEK293 cells (9,513 proteins and 396,736 peptides), this approach matched an additional 184,000 modified peptides, which were linked to biological and chemical modifications representing 523 distinct mass bins, including phosphorylation, glycosylation and methylation. We localized all unknown modification masses to specific regions within a peptide. Known modifications were assigned to the correct amino acids with frequencies >90%. We conclude that at least one-third of unassigned spectra arise from peptides with substoichiometric modifications.
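The precursor-window idea behind the closed and open searches can be sketched in a few lines. This is only an illustration of the mass arithmetic, not the Sequest implementation: the function names, the 5 ppm closed window and the ±500 Da open window as defaults, and the three labels are assumptions chosen to mirror the numbers quoted on this page. A real search also scores fragment ions, which is omitted here.

```python
# Minimal sketch: classify a precursor by its mass difference (delta M)
# from a candidate peptide's theoretical mass.
# All names here are hypothetical; this is not the Sequest API.

def ppm_to_da(mass_da: float, ppm: float) -> float:
    """Convert a ppm tolerance into an absolute tolerance in Da at a given mass."""
    return mass_da * ppm * 1e-6


def classify_precursor(observed_da: float,
                       theoretical_da: float,
                       closed_ppm: float = 5.0,
                       open_da: float = 500.0) -> str:
    """Return 'unmodified', 'modified', or 'unmatched' from delta M alone."""
    delta_m = observed_da - theoretical_da
    if abs(delta_m) <= ppm_to_da(theoretical_da, closed_ppm):
        return "unmodified"   # inside the narrow (closed) window
    if abs(delta_m) <= open_da:
        return "modified"     # only reachable with the wide (open) window
    return "unmatched"        # outside even the open window


if __name__ == "__main__":
    # A +79.9663 Da shift (the monoisotopic mass of HPO3) on a 1000 Da peptide.
    print(classify_precursor(1079.9663, 1000.0))
```

At a peptide mass of 1000 Da, 5 ppm corresponds to 0.005 Da, which is why a closed search cannot match a peptide that carries an unspecified modification.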

Figures

Figure 1. A very wide precursor ion (Open) search setting identified ~185,000 modified peptides
A) Most Unimod.org-reported modifications change precursor masses by < 500 Da. B) The vast majority of unmodified peptides are re-identified from Open searches. MS/MS spectra from mouse brain peptides analyzed in triplicate were searched using the Sequest algorithm, varying only the precursor ion tolerance. Note that fragment ion tolerance remained very strict (0.01 Da). At ±500 Da, 86% of unmodified peptides matched from an accurate mass search (5 ppm or ~0.005 Da) were still assigned at a 1% FDR. C) A proteome-wide dataset was collected by LC-MS/MS from trypsinized and fractionated HEK293 cell lysate and assessed through either an Open or Closed search. D) The Open search identified more than 184,000 peptides with modified ΔM (mass change) values between −500 and +500 Da. E) ΔM distribution for 510,139 peptides. In addition to the 325,157 unmodified peptides, the 184,982 modified peptides are distributed according to the exact net mass change of their modification. The inset shows a zoomed-in view of ~2000 phosphorylated peptides. F–I) Comparison of identical modified peptides matched using a directed Closed search (where the modification was specified as differential) with an Open (±500-Da) search for four known peptide modifications: oxidation, carbamoylation, phosphorylation and deamidation.
Figure 2. Averaging many independent events provided very accurate net modification mass differences (sub-ppm)
A–L) Examples of residual mass bins from Figure 1C showing the frequency of peptide identifications and the average mass difference. Bins are drawn at 0.001 Da intervals. The deduced modification is also shown.
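The binning and averaging described in this caption can be illustrated with a short sketch. It assumes ΔM values are already available as a plain list of floats; the function name, the nearest-bin rounding rule, and the example values are hypothetical and not taken from the paper. Only the 0.001 Da bin width comes from the caption.

```python
# Minimal sketch: group delta-mass values into 0.001 Da bins, then report
# the count and the mean delta mass for each bin.
from collections import defaultdict
from statistics import mean

BIN_WIDTH_DA = 0.001  # bin width stated in the caption


def summarize_delta_mass_bins(delta_masses, bin_width=BIN_WIDTH_DA):
    """Return (bin_center_da, count, mean_delta_da) tuples, most populated first."""
    bins = defaultdict(list)
    for dm in delta_masses:
        # Assumption: each value goes to its nearest bin; the paper may bin differently.
        bins[round(dm / bin_width)].append(dm)
    summary = [(index * bin_width, len(values), mean(values))
               for index, values in bins.items()]
    # Sort by count (descending), then by bin center for a stable order.
    return sorted(summary, key=lambda row: (-row[1], row[0]))


if __name__ == "__main__":
    # Hypothetical values scattered around a phosphorylation-like and an oxidation-like shift.
    example = [79.9661, 79.9662, 79.9664, 15.9949, 15.9950]
    for center, count, avg in summarize_delta_mass_bins(example):
        print(f"bin {center:.3f} Da: n={count}, mean={avg:.5f} Da")
```

If the individual measurement errors are independent, the uncertainty of the bin mean shrinks roughly with the square root of the number of peptides in the bin, which is consistent with the caption's point that averaging many events yields a more accurate mass difference than any single measurement.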
Figure 3. Many negative Δmass peptides are generated via in-source dissociation
A) In-source dissociation is observed as the co-elution of two related ions, one being the precursor ion and the second being the same precursor with a mass loss due to fragmentation. Once fragmented and recorded in MS2 spectra, these two related ions share fragment ion masses. B) MS/MS spectrum for the intact precursor from Panel A. C) MS/MS spectrum for the in-source fragmented peptide. This peptide is only matched in the Open search. In-source dissociation events are observed in 500-Da searches as amino acid losses from one terminus of the peptide. In this example, none of the b-type ions (marked with an asterisk) were matched in the 500-Da search, but nearly all y-type ions were, identifying the peptide. D) With accurate mass measurements in the MS2 scans, the fragment ions were measured at low parts-per-million accuracy.
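The terminal-loss signature described in this caption can be checked with simple arithmetic: a negative Δmass that equals the summed residue masses of the first or last few amino acids is consistent with loss from that terminus. The sketch below is an illustration under stated assumptions, not the authors' procedure. The function name, the 0.005 Da tolerance, and the example peptide are hypothetical; the residue masses are standard monoisotopic values.

```python
# Minimal sketch: test whether a negative delta mass matches the loss of
# N- or C-terminal residues from a peptide sequence.

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def terminal_loss_candidates(peptide: str, delta_mass_da: float,
                             tol_da: float = 0.005):
    """Return (terminus, n_residues) pairs whose summed mass explains delta_mass_da."""
    hits = []
    n_term_loss = 0.0
    c_term_loss = 0.0
    # Remove 1 .. len-1 residues from each end; losing every residue is not meaningful.
    for n in range(1, len(peptide)):
        n_term_loss += RESIDUE_MASS[peptide[n - 1]]
        c_term_loss += RESIDUE_MASS[peptide[-n]]
        if abs(delta_mass_da + n_term_loss) <= tol_da:
            hits.append(("N", n))
        if abs(delta_mass_da + c_term_loss) <= tol_da:
            hits.append(("C", n))
    return hits


if __name__ == "__main__":
    # Hypothetical peptide; the delta mass equals the loss of the N-terminal "AG".
    print(terminal_loss_candidates("AGLSEK", -(71.03711 + 57.02146)))
```

An N-terminal loss leaves the C-terminal (y-type) fragments unchanged while shifting every b-type fragment, which fits the pattern in panel C where y-type ions matched and b-type ions did not.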
Figure 4. Analysis of ~185,000 peptides provided insights into rare biological modifications and amino acid variants/variations
In each example, as shown in the inset, the modified form was detected more frequently than the matching unmodified peptide form. A) Glycerol phosphorylethanolamine modification of glutamate residue 301 in Elongation factor 1a2 was identified with hundreds of spectral counts. B) A diphthamide modification was identified at position H715 of Elongation factor 2. C) A tele-methylhistidine modification was identified on cytoplasmic actin, ACTB. D) Nucleoplasmin was identified with two modifications, a phosphorylation event at S135 and glutamyl modifications at E136. The glutamylation event was only identified when phosphorylation was present, giving a Δmass value of +209.0089 Da. E) One example from numerous amino acid variations identified in 293 cell Open searches. Succinate dehydrogenase with a V657I mutation is shown. F) Complex variations were also identified using the Open search. For example, several alanine residues were added to the Ribosomal Protein L14 at position 159. These insertions ranged from as few as three to as many as six alanines.
