Proteomics. 2020 Nov;20(21-22):e1900351.
doi: 10.1002/pmic.201900351. Epub 2020 Jun 25.

The Age of Data-Driven Proteomics: How Machine Learning Enables Novel Workflows

Robbin Bouwmeester et al. Proteomics. 2020 Nov.

Abstract

A lot of energy in the field of proteomics is dedicated to the application of challenging experimental workflows, which include metaproteomics, proteogenomics, data independent acquisition (DIA), non-specific proteolysis, immunopeptidomics, and open modification searches. These workflows are all challenging because of ambiguity in the identification stage; they either expand the search space and thus increase the ambiguity of identifications, or, in the case of DIA, they generate data that is inherently more ambiguous. In this context, machine learning-based predictive models are now generating considerable excitement in the field of proteomics because these predictive models hold great potential to drastically reduce the ambiguity in the identification process of the above-mentioned workflows. Indeed, the field has already produced classical machine learning and deep learning models to predict almost every aspect of a liquid chromatography-mass spectrometry (LC-MS) experiment. Yet despite all the excitement, thorough integration of predictive models in these challenging LC-MS workflows is still limited, and further improvements to the modeling and validation procedures can still be made. Therefore, highly promising recent machine learning developments in proteomics are pointed out in this viewpoint, alongside some of the remaining challenges.

Keywords: data driven modeling; deep learning; machine learning.
