Review
Comput Struct Biotechnol J. 2022 Nov 11:20:6271-6286.
doi: 10.1016/j.csbj.2022.11.012. eCollection 2022.

Deep learning for protein secondary structure prediction: Pre and post-AlphaFold

Dewi Pramudi Ismi et al. Comput Struct Biotechnol J.

Abstract

This paper provides a comprehensive review of the trends and challenges of deep neural networks for protein secondary structure prediction (PSSP). In recent years, deep neural networks have become the primary method for PSSP. Previous studies showed that deep neural networks have raised the accuracy of three-state secondary structure prediction to more than 80%. Widely used deep learning methods, such as convolutional neural networks, recurrent neural networks, inception networks, and graph neural networks, have been applied to PSSP. Methods adapted from natural language processing (NLP) and computer vision are also employed, including attention mechanisms, ResNet, and U-shaped networks. In the post-AlphaFold era, PSSP studies focus on different objectives, such as enhancing the quality of evolutionary information and exploiting protein language models as the PSSP input. The recent trend of using pre-trained language models as input features for secondary structure prediction provides a new direction for PSSP studies. Moreover, the state-of-the-art accuracy achieved by existing PSSP models is still below the theoretical limit, so there is still room for improvement in the field.
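The three-state accuracy mentioned in the abstract is usually reported as Q3, the fraction of residues whose predicted state (helix, strand, or coil) matches the observed one. The following is a minimal sketch of that calculation, not code from the paper. The function names and the example strings are hypothetical, and the 8-to-3 state reduction shown is one commonly used convention; other reductions exist and give slightly different scores.

```python
# Assumption: one common reduction of DSSP 8-state labels to 3 states
# (H = helix, E = strand, C = coil). Other conventions are also in use.
DSSP8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C", "C": "C", "-": "C",  # everything else as coil
}


def reduce_to_q3(labels: str) -> str:
    """Map a string of 8-state labels to the 3-state alphabet."""
    return "".join(DSSP8_TO_Q3[label] for label in labels)


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of positions where the 3-state labels agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


if __name__ == "__main__":
    # Hypothetical 16-residue example, for illustration only.
    observed = reduce_to_q3("--HHHHGGTEEEEB-S")
    predicted = "CCHHHHHHCEEEECCC"
    print(observed)
    print(q3_accuracy(predicted, observed))
```

In this toy example one residue out of sixteen is mislabeled. Published Q3 figures are computed over whole test sets rather than single chains.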

Keywords: Deep neural networks; Machine learning; Prediction accuracy; Protein; Protein secondary structure prediction.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1. A PSSP model takes a sequence of amino acids as input and produces the sequence of corresponding secondary structure elements.
Fig. 2. The general framework of PSSP models: two phases in model development, namely training and evaluation. The training dataset is used to build the model, while the test dataset is used to confirm the performance of the trained model.
Fig. 3. One-hot encoding of amino acids and position-specific scoring matrix (PSSM).
Fig. 4. Deep learning methods and the number of PSSP works employing them during 2016–2021.
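The input representations named in the Fig. 3 caption can be illustrated with a short sketch. It one-hot encodes a sequence over the 20 standard amino acids and concatenates each row with a per-residue profile row such as a PSSM. This is an illustration under stated assumptions, not the encoding used by any particular model in the review: the alphabet ordering, the function names, and the all-zero treatment of non-standard residues are choices made here, and the PSSM values are placeholders rather than real alignment scores.

```python
from typing import List

# Assumption: the 20 standard amino acids, ordered by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence: str) -> List[List[int]]:
    """Return an L x 20 matrix with one row per residue.

    Residues outside the 20-letter alphabet (for example 'X') become
    all-zero rows; this is a choice made for this sketch.
    """
    rows = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        index = AA_INDEX.get(residue)
        if index is not None:
            row[index] = 1
        rows.append(row)
    return rows


def concat_features(one_hot: List[List[int]],
                    pssm: List[List[float]]) -> List[List[float]]:
    """Concatenate one-hot rows with PSSM rows residue by residue."""
    if len(one_hot) != len(pssm):
        raise ValueError("one-hot and PSSM must cover the same residues")
    return [list(a) + list(b) for a, b in zip(one_hot, pssm)]


if __name__ == "__main__":
    seq = "ACX"
    encoded = one_hot_encode(seq)
    # Placeholder PSSM: real values would come from a sequence alignment.
    placeholder_pssm = [[0.0] * 20 for _ in seq]
    features = concat_features(encoded, placeholder_pssm)
    print(len(features), len(features[0]))
```

With a 20-column PSSM, each residue ends up with a 40-dimensional feature vector, which is then the per-position input to the network.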
