Mol Cell Proteomics. 2015 Sep;14(9):2331-40.

doi: 10.1074/mcp.M115.051300. Epub 2015 Jun 22.

Using Data Independent Acquisition (DIA) to Model High-responding Peptides for Targeted Proteomics Experiments

Brian C Searle¹, Jarrett D Egertson², James G Bollinger², Andrew B Stergachis², Michael J MacCoss³

Affiliations

¹ Department of Genome Sciences, University of Washington, Seattle, Washington 98195; Proteome Software Inc., Portland, OR 97219.
² Department of Genome Sciences, University of Washington, Seattle, Washington 98195.
³ Department of Genome Sciences, University of Washington, Seattle, Washington 98195; maccoss@uw.edu.

PMID: 26100116
PMCID: PMC4563719
DOI: 10.1074/mcp.M115.051300

Brian C Searle et al. Mol Cell Proteomics. 2015 Sep.

Abstract

Targeted mass spectrometry is an essential tool for detecting quantitative changes in low-abundance proteins throughout the proteome. Although selected reaction monitoring (SRM) is the preferred method for quantifying peptides in complex samples, designing SRM assays is laborious. Peptides have widely varying signal responses dictated by sequence-specific physicochemical properties, so one major challenge is selecting representative peptides to target as a proxy for protein abundance. Here we present PREGO, a software tool that predicts high-responding peptides for SRM experiments. PREGO predicts peptide responses with an artificial neural network trained using 11 minimally redundant, maximally relevant properties. Crucial to its success, PREGO is trained using fragment ion intensities of equimolar synthetic peptides extracted from data independent acquisition (DIA) experiments. Relative peptide responses from DIA experiments are a suitable substitute for those from SRM experiments because the instrumentation and the nature of data collection are similar, and both make quantitative measurements from integrated fragment ion chromatograms. Using an SRM experiment containing 12,973 peptides from 724 synthetic proteins, PREGO exhibits a 40-85% improvement over previously published approaches at selecting high-responding peptides. These results also represent a dramatic improvement over the rules-based peptide selection approaches commonly used in the literature.


Figures

**Fig. 1.**
**A histogram of the dynamic ranges calculated for 724 proteins.** The dynamic range is estimated as the number of orders of magnitude separation for each protein. This value is calculated as the difference between the log₁₀ intensities of the highest responding peptide and the lowest responding peptide. The median dynamic range is 3.4 orders of magnitude, with an interquartile range of 1.2 orders. All protein intensity data was drawn from the Stergachis *et al.* SRM testing data set.
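The per-protein dynamic range described in this caption can be sketched in a few lines. This is only an illustration of the stated formula (difference of log₁₀ intensities between the highest and lowest responding peptide); the function name, the example intensities, and the choice to ignore non-positive intensities are assumptions, not details from the paper.

```python
import math

def dynamic_range(intensities):
    """Orders of magnitude between the highest and lowest responding peptide."""
    # Assumption: zero or negative intensities are dropped, since log10 is undefined for them.
    positive = [x for x in intensities if x > 0]
    if len(positive) < 2:
        raise ValueError("need at least two positive intensities")
    return math.log10(max(positive)) - math.log10(min(positive))

# Hypothetical peptide intensities for a single protein (not taken from the data set).
example_intensities = [2.5e7, 4.0e5, 1.0e4]
print(round(dynamic_range(example_intensities), 2))
```

Applying such a function to every protein and collecting the results would give the values that a histogram like Fig. 1 summarizes.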

**Fig. 2.**
**Algorithmic outline of the PREGO method.** A, Algorithmic outline describing feature selection using an mRMR style algorithm to identify nonredundant features with maximum relevance. Feature sets with low redundancy often decrease the potential for over-training in machine learning algorithms. B, Algorithmic outline for neural network construction using the mRMR-selected feature set. C, Testing of the algorithm was performed using the Stergachis *et al.* SRM testing data set.
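To make the idea of "nonredundant features with maximum relevance" concrete, here is a minimal greedy sketch in the mRMR style. It is not PREGO's implementation: the caption does not state the exact selection criterion, so this sketch swaps in absolute Pearson correlation for both relevance (feature versus response) and redundancy (feature versus already selected features), whereas classic mRMR uses mutual information. All feature names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def mrmr_select(features, target, k):
    """Greedily pick k feature names maximizing relevance minus mean redundancy."""
    relevance = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical data: feature_a_near_copy is almost identical to feature_a.
target = [1, 2, 3, 4, 5, 6]
features = {
    "feature_a": [1, 2, 3, 4, 5, 7],
    "feature_a_near_copy": [1, 2, 3, 4, 5, 8],
    "feature_c": [2, 1, 4, 3, 6, 5],
}
print(mrmr_select(features, target, 2))
```

In this toy example the near copy is individually more relevant than `feature_c`, but it is skipped on the second pick because it adds little beyond `feature_a`. That is the behavior the caption associates with a lower risk of over-training.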

**Fig. 3.**
**PREGO Scores for peptides in CASZ1.** Peptides in CASZ1 (also known as cDNA FLJ20321) are ranked on their experimentally acquired transition fragment intensity from the Stergachis *et al.* SRM testing data set where the peptide with the strongest response is awarded a rank of one. The top 20% of peptides by intensity rank are considered “high-responding peptides” and are shaded in blue. The top five peptides chosen by PREGO are marked with red borders. Although there is large variation in predicting response intensities for any given peptide (solid line), there is a definite trend (dashed line) to score first ranked peptides somewhat higher than worse ranked peptides. Consequently, the highest scoring peptides picked by PREGO are often also high-responding peptides. CASZ1 represents a “typical” protein with a correlation score of 0.65.
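The ranking and "high-responding" labeling used in this figure can be sketched as follows. The caption only states that the strongest response gets rank one and that the top 20% by intensity rank are high-responding; rounding the cutoff up and always keeping at least one peptide are assumptions of this sketch, and the peptide labels and intensities are placeholders.

```python
import math

def rank_and_flag(intensities, top_fraction=0.2):
    """Map each peptide to (rank, is_high_responder); rank 1 is the strongest response."""
    ordered = sorted(intensities, key=intensities.get, reverse=True)
    # Assumption: round the cutoff up and keep at least one peptide per protein.
    # round() guards against floating-point noise such as 0.2 * 15 = 3.0000000000000004.
    cutoff = max(1, math.ceil(round(top_fraction * len(ordered), 9)))
    return {pep: (i + 1, i < cutoff) for i, pep in enumerate(ordered)}

# Placeholder peptides and intensities for one hypothetical protein.
measured = {"PEP_A": 3.0e4, "PEP_B": 8.0e6, "PEP_C": 1.2e5, "PEP_D": 9.0e3, "PEP_E": 4.5e5}
for peptide, (rank, high) in sorted(rank_and_flag(measured).items(), key=lambda kv: kv[1][0]):
    print(rank, peptide, "high-responding" if high else "")
```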

**Fig. 4.**
**Score distributions for four scoring methods by peptide rank.** A, The PREGO score distribution for peptides of descending rank across the entire Stergachis *et al.* SRM testing data set. The median ranks are annotated as dots, and the nearest-neighbor-smoothed trend is plotted as a black line. The interquartile range (Q1 to Q3) is shaded blue. In general, first ranked peptides with the highest responses tend to get higher scores than those of lower ranks, as indicated by the downward trend from left to right. The score distributions for B, PPA, and for the CONSeQuence C, artificial neural network (ANN) and D, support vector machine (SVM) all show weaker downward trends.

**Fig. 5.**
**Percentage of proteins with at least one high-responding peptide, given N peptides picked.** A, PREGO (blue), PPA (red), CONSeQuence artificial neural network (ANN, orange), and support vector machine (SVM, purple) machine learning-based scorers are compared with random peptide selection (green) and with the simple scoring function described in Equation 2 (cyan), which is based on common rules in the literature. Scorers are graded on the likelihood that, for any given protein, they predict at least one high-responding peptide given N guesses. This is analogous to the strategy of picking N peptides to produce at least one useful peptide for each protein. For example, in Fig. 3 the top 1–5 peptides picked in CASZ1 have red borders and the high-responding peptides are shaded in blue. B, The same four learning-based scorers shown as a percentage improvement over rules-based peptide selection. PREGO is dramatically better than the other approaches tested here at predicting high-responding peptides given five or fewer chances. All scoring data are based on the Stergachis *et al.* SRM testing data set.
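The grading scheme in this caption (the share of proteins for which at least one of the N top-scored peptides is a high responder) can be sketched as below. It assumes the top-20% definition from Fig. 3, with the cutoff rounded up and at least one high responder per protein. The data layout, function name, and example numbers are hypothetical.

```python
import math

def fraction_with_hit(proteins, n_picks, top_fraction=0.2):
    """Fraction of proteins where the n_picks best-scored peptides include a high responder.

    proteins maps a protein name to a list of (predicted_score, measured_intensity) pairs.
    """
    hits = 0
    for peptides in proteins.values():
        n = len(peptides)
        # Assumption: top 20% by intensity, rounded up, at least one peptide.
        cutoff = max(1, math.ceil(round(top_fraction * n, 9)))
        by_intensity = sorted(range(n), key=lambda i: peptides[i][1], reverse=True)
        high_responders = set(by_intensity[:cutoff])
        by_score = sorted(range(n), key=lambda i: peptides[i][0], reverse=True)
        picks = set(by_score[:n_picks])
        if high_responders & picks:
            hits += 1
    return hits / len(proteins)

# Two hypothetical proteins with five peptides each: (score, intensity).
toy_proteins = {
    "protein_1": [(0.9, 1e6), (0.5, 5e4), (0.4, 2e4), (0.2, 1e4), (0.1, 5e3)],
    "protein_2": [(0.8, 1e4), (0.7, 9e5), (0.3, 3e4), (0.2, 2e4), (0.1, 1e3)],
}
for n in (1, 2):
    print(n, fraction_with_hit(toy_proteins, n))
```

In the toy data the scorer finds the high responder of `protein_1` with its first pick but needs a second pick for `protein_2`, so the fraction rises from 0.5 to 1.0 as N goes from 1 to 2. Curves like those in panel A plot this quantity against N for each scorer.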


Cited by

SWATH-MS and MRM: Quantification of Ras-related proteins in HIV-1 infected and methamphetamine-exposed human monocyte-derived macrophages (hMDM).
Macur K, Zieschang S, Lei S, Morsey B, Jaquet S, Belshan M, Fox HS, Ciborowski P. Proteomics. 2021 Aug;21(15):e2100005. doi: 10.1002/pmic.202100005. Epub 2021 Jun 17. PMID: 34051048.
A TRUSTED targeted mass spectrometry assay for pan-herpesvirus protein detection.
Kennedy MA, Tyl MD, Betsinger CN, Federspiel JD, Sheng X, Arbuckle JH, Kristie TM, Cristea IM. Cell Rep. 2022 May 10;39(6):110810. doi: 10.1016/j.celrep.2022.110810. PMID: 35545036.
Insight on physicochemical properties governing peptide MS1 response in HPLC-ESI-MS/MS: A deep learning approach.
Abdul-Khalek N, Wimmer R, Overgaard MT, Gregersen Echers S. Comput Struct Biotechnol J. 2023 Jul 22;21:3715-3727. doi: 10.1016/j.csbj.2023.07.027. eCollection 2023. PMID: 37560124.
A Skyline Plugin for Pathway-Centric Data Browsing.
Degan MG, Ryadinskiy L, Fujimoto GM, Wilkins CS, Lichti CF, Payne SH. J Am Soc Mass Spectrom. 2016 Nov;27(11):1752-1757. doi: 10.1007/s13361-016-1448-3. Epub 2016 Aug 16. PMID: 27530777.
CIDer: A Statistical Framework for Interpreting Differences in CID and HCD Fragmentation.
Wilburn DB, Richards AL, Swaney DL, Searle BC. J Proteome Res. 2021 Apr 2;20(4):1951-1965. doi: 10.1021/acs.jproteome.0c00964. Epub 2021 Mar 17. PMID: 33729787.


References

1. Marx V. (2013) Targeted proteomics. Nat. Methods 10, 19–22
2. Liebler D. C., Zimmerman L. J. (2013) Targeted quantitation of proteins by mass spectrometry. Biochemistry 52, 3797–3806
3. Picotti P., Aebersold R. (2012) Selected reaction monitoring-based proteomics: workflows, potential, pitfalls, and future directions. Nat. Methods 9, 555–566
4. Stergachis A. B., MacLean B., Lee K., Stamatoyannopoulos J. A., MacCoss M. J. (2011) Rapid empirical discovery of optimal peptides for targeted proteomics. Nat. Methods 8, 1041–1043
5. Bereman M. S., MacLean B., Tomazela D. M., Liebler D. C., MacCoss M. J. (2012) The development of selected reaction monitoring methods for targeted proteomics via empirical refinement. Proteomics 12, 1134–1141
