Review
Proteomics. 2020 Nov;20(21-22):e1900335.
doi: 10.1002/pmic.201900335. Epub 2020 Oct 30.

Deep Learning in Proteomics

Bo Wen et al. Proteomics. 2020 Nov.

Abstract

Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts high-level, abstract representations from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview is provided of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction. Limitations and future directions of deep learning in proteomics are also discussed. This review will provide readers with an overview of deep learning and how it can be used to analyze proteomics data.

Keywords: bioinformatics; deep learning; proteomics.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. Overview of the key components of deep learning and its applications in proteomics.
Figure 2. Brief network architectures of the deep learning tools for MS/MS spectrum prediction. FC layer refers to the fully connected layer, BiLSTM refers to bidirectional LSTM, and BiGRU refers to bidirectional GRU. For different models, metadata may include precursor charge state, precursor mass, collision energy, instrument type, etc. "∼" marks the cleavage site.
Figure 3. From image captioning to DeepNovo. A) A typical neural network architecture for image captioning. B) The neural network architectures of DeepNovo and DeepNovo-DIA.
Figure 4. Workflow and network architectures of common protein structure prediction methods. A) Schematic summary of contact-guided structure prediction methods. Different methods may use different kinds of features and network architectures, but co-evolutionary information is essential for good contact prediction. Contacts or distances between residue pairs and other predicted geometric constraints are fed into various methods for structure modelling or converted to protein-specific potentials for direct optimization. SS, secondary structure; SASA, solvent accessible surface area. B) An end-to-end recurrent geometric network predicts structure without co-evolutionary information. First, two BiLSTM layers predict backbone torsion angles; second, a geometric layer adds residues one by one to construct the structure, using the torsion angles and the atoms of the last residue.
