J Proteome Res. 2019 Feb 1;18(2):715-720.
doi: 10.1021/acs.jproteome.8b00728. Epub 2018 Dec 17.

DeltaMass: Automated Detection and Visualization of Mass Shifts in Proteomic Open-Search Results

Dmitry M Avtonomov et al. J Proteome Res.

Abstract

Routine identification of thousands of proteins in a single LC-MS experiment has long been the norm. With these vast amounts of data, more rigorous treatment of modified forms of peptides becomes possible. "Open search", a protein database search with a large precursor ion mass tolerance window, is becoming a popular method to evaluate possible sets of post-translational and chemical modifications in samples. The extraction of statistical information about the modifications from peptide search results requires additional effort and data processing, such as recalibration of masses and accurate detection of precursors in MS1 signals. Here we present a software tool, DeltaMass, which performs kernel-density-based estimation of observed mass shifts and allows for the detection of poorly resolved mass deltas. The software also maps observed mass shifts to known modifications from public databases such as UniMod and augments them with additionally generated possible chemical changes to the molecule. Its interactive graphical interface provides an effective option for the visual interrogation of the data and the identification of potentially interesting mass shifts or unusual artifacts for subsequent analysis. The program can also be run in a fully automated command-line mode to generate mass-shift peak lists.
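The workflow summarized in the abstract (density estimate, peak detection, mapping to known modifications) can be illustrated with a minimal sketch. This is not DeltaMass's implementation: the fixed bandwidth, the height threshold, the toy mass shifts, and the two-entry modification table are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at every grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

def local_maxima(grid, density, min_height):
    """Return grid positions that are local density maxima above min_height."""
    return [
        grid[i]
        for i in range(1, len(grid) - 1)
        if density[i] >= min_height
        and density[i] > density[i - 1]
        and density[i] >= density[i + 1]
    ]

def annotate(peaks, known_mods, tolerance):
    """Pair each peak with the known modifications within `tolerance` Da."""
    return [
        (peak, [name for name, mass in known_mods.items()
                if abs(mass - peak) <= tolerance])
        for peak in peaks
    ]

# Hypothetical toy input: two clusters of PSM mass shifts plus one stray value.
mass_shifts = [
    0.9838, 0.9840, 0.9841, 0.9843, 0.9839,   # near deamidation
    1.0031, 1.0033, 1.0034, 1.0036, 1.0034,   # near the 13C/12C mass difference
    0.9700,                                   # isolated observation
]
# Hypothetical two-entry lookup table standing in for a database such as UniMod.
KNOWN_MODS = {"Deamidation": 0.984016, "13C isotope error": 1.003355}

grid = [0.95 + i * 1e-4 for i in range(1001)]           # 0.95 .. 1.05 Da
density = gaussian_kde(mass_shifts, grid, bandwidth=0.0005)  # assumed bandwidth
peaks = local_maxima(grid, density, min_height=150.0)        # assumed threshold

for peak, names in annotate(peaks, KNOWN_MODS, tolerance=0.001):
    print(f"{peak:.4f} Da -> {names or 'unannotated'}")
```

The height threshold is what keeps the isolated observation from being reported as a peak; a real tool would need a more principled significance measure.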

Keywords: GUI; chemical modification; data visualization; kernel density; mass shift; open search; peak detection; post-translational modification; proteomics.

Figures

Figure 1:
Full profile of delta masses in the range −200 to 200 Da in the data set used in Chick et al., ProteomeXchange ID PXD001468. Many well-defined peaks have no known mappings in UniMod.
Figure 2:
The user interface for starting DeltaMass exposes most of the options available through the command line.
Figure 3:
(Left) Kernel density estimate of mass shifts around +1 Da obtained from 24 LC-MS files of ProteomeXchange data set PXD001468. A small bulge near a shift of +0.98 Da can be discerned, corresponding to deamidation, but the shape of the KDE is very noisy overall. (Middle) Zero-peak correction applied. The KDE now looks like a mixture of two normal distributions. (Right) Mass recalibration using identified peptides applied to the same data set.
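One way to picture the zero-peak correction in the middle panel is a per-file offset subtraction. The sketch below assumes the correction amounts to estimating, for each LC-MS file, the median mass shift of PSMs near 0 Da and subtracting it from every shift in that file; the actual procedure in DeltaMass may differ. The function name, the window width, and the two toy runs are hypothetical.

```python
from statistics import median

def zero_peak_correct(shifts_by_file, window=0.05):
    """Subtract each file's near-zero median shift from all of its shifts.

    `window` (Da) is an assumed cutoff for which PSMs count as unmodified.
    """
    corrected = {}
    for name, shifts in shifts_by_file.items():
        near_zero = [s for s in shifts if abs(s) <= window]
        # Fall back to no correction when a file has no near-zero PSMs.
        offset = median(near_zero) if near_zero else 0.0
        corrected[name] = [s - offset for s in shifts]
    return corrected

# Hypothetical runs with opposite systematic mass errors.
runs = {
    "run_a": [0.004, 0.005, 0.006, 0.989],
    "run_b": [-0.003, -0.002, -0.001, 0.982],
}
corrected = zero_peak_correct(runs)
for name, shifts in corrected.items():
    print(name, [round(s, 4) for s in shifts])
```

In this toy input the two shifts near +0.98 Da disagree by 0.007 Da before correction and coincide afterwards, which is the kind of sharpening the panel shows.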
Figure 4:
Three histograms of PSM mass shifts with bins of different widths (0.005, 0.001, 0.0002 Da). The kernel density estimate (KDE) is shown as a red line. The KDE bandwidth, which is roughly analogous to the bin size for histograms, was selected automatically without user intervention. The green histogram (0.001 bin width) appears to be the best fit for the data; however, this does not yet take into account the problem of selecting the histogram bin offset. Selecting too small a bin width leads to the histogram falling apart (blue). Selecting too large a bin width loses accuracy (red). Even for the best histogram (green), the peak's location could not be determined to the same level of precision as with the KDE.
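The bin-offset problem mentioned in this caption can be shown with a small sketch: the same values, binned at the same width, yield different modal bin centers depending on where the bin edges fall. The function name and the five toy mass shifts are hypothetical.

```python
import math
from collections import Counter

def modal_bin_center(values, width, offset=0.0):
    """Return the center of the fullest histogram bin.

    Bin edges sit at offset + k * width for integer k.
    """
    counts = Counter(math.floor((v - offset) / width) for v in values)
    index, _ = max(counts.items(), key=lambda item: item[1])
    return offset + (index + 0.5) * width

# Hypothetical mass shifts clustered around +0.984 Da.
values = [0.9837, 0.9839, 0.9841, 0.9842, 0.9844]

center_a = modal_bin_center(values, width=0.001, offset=0.0)
center_b = modal_bin_center(values, width=0.001, offset=0.0005)
print(f"offset 0.0000: modal bin center {center_a:.4f}")
print(f"offset 0.0005: modal bin center {center_b:.4f}")
```

With edges at multiples of 0.001 Da the cluster is split across two bins, while shifting the edges by half a bin puts all five values in one bin, so the apparent peak location moves by half a bin width. A kernel density estimate has no bin edges and avoids this particular ambiguity.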
Figure 5:
Visual guide to estimating peak width in your data. Vertical lines on the plots mark locations of known modifications. The horizontal scale is the same in both panels. The left panel shows the Oxidation peak, which has no interfering modifications near the main +15.9949 peak. While there are other known mass shifts, they are quite far away and are very uncommon compared to Oxidation. For comparison, the +18 region is shown in the right panel. That region corresponds to many different amino acid substitutions, the addition of ammonium, and others.
Figure 6:
Peaks inferred from the kernel density function plotted over the KDE itself. Peak height represents significance of the peak within the nominal mass region.
Figure 7:
Interactive kernel density function viewer, displaying PSMs and known modifications. PSMs from the selected region (highlighted) are displayed in the table; known modification information is shown on the bottom left. Detected peaks are rendered as purple dotted lines, and teal dashed lines running vertically denote locations of known modifications. In this example it can be observed that many sequences have phenylalanine (symbol 'F', monoisotopic mass 147.0684) in the first position, suggesting that this peak is likely due to in-source fragmentation. Even though loss of Phe is not listed in UniMod, it is reported by the included automatic annotation generator.
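The observation about first-position residues can be checked programmatically once the PSMs under a peak are exported. The sketch below tallies the first residue of each peptide sequence; the peptide list and function name are hypothetical, and nothing here reflects how the viewer itself computes or displays this information.

```python
from collections import Counter

def first_residue_fractions(peptides):
    """Return the fraction of peptides starting with each residue."""
    counts = Counter(p[0] for p in peptides if p)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}

# Hypothetical sequences from PSMs falling under one mass-shift peak.
selected = ["FLDGTK", "FAEELR", "FVNQHLK", "FSPEDR", "AVLDTK"]

fractions = first_residue_fractions(selected)
residue, share = max(fractions.items(), key=lambda item: item[1])
print(f"most common first residue: {residue} ({share:.0%} of PSMs)")
```

A strong enrichment of one residue at the first position, combined with a mass shift whose magnitude matches that residue's mass (147.0684 for Phe in the caption), would be consistent with the in-source fragmentation explanation.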
Figure 8:
(Top left) Looking at the general KDE profile, the user might notice something interesting, such as double peaks. (Bottom left) In the viewer the user can select the region of interest and see that the right-leaning peaks do have some possible known mappings; teal dotted vertical lines show known mass shifts. (Right half) The user can then select the left-leaning peak, which has no mappings, and the viewer displays the corresponding PSMs in the table. In this example many of the selected sequences start with 'Q', which is one possible lead for subsequent investigation. It can also be noticed that there is another, larger peak to the left and that the peaks spaced by 1 Da decrease in intensity, suggesting that this might be attributed to C12/C13 isotope selection errors by the instrument.
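The isotope-error pattern described here (peaks about 1 Da apart with decreasing intensity) can be tested with a simple heuristic. The sketch below assumes peaks are given as (mass shift, height) pairs and uses the 13C/12C mass difference of 1.003355 Da as the expected spacing; the tolerance, the function name, and the toy peak lists are hypothetical.

```python
C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C, in Da

def looks_like_isotope_ladder(peaks, tolerance=0.005):
    """Check for peaks spaced by ~1.0034 Da with strictly decreasing height.

    `peaks` is a list of (mass_shift, height) pairs; `tolerance` is an
    assumed allowance (Da) on the spacing between neighbouring peaks.
    """
    ordered = sorted(peaks)
    if len(ordered) < 2:
        return False
    for (mass_a, height_a), (mass_b, height_b) in zip(ordered, ordered[1:]):
        spacing_ok = abs((mass_b - mass_a) - C13_C12_DELTA) <= tolerance
        if not spacing_ok or height_b >= height_a:
            return False
    return True

# Hypothetical peak lists: one ladder-like, one with irregular spacing.
ladder = [(10.0000, 100.0), (11.0034, 40.0), (12.0067, 12.0)]
unrelated = [(10.0000, 100.0), (10.9840, 40.0), (12.0067, 12.0)]

print(looks_like_isotope_ladder(ladder))
print(looks_like_isotope_ladder(unrelated))
```

This only flags candidates; confirming an isotope selection error would still require inspecting the underlying spectra.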

