Review
2020 Jan 22:18:1301-1310.
doi: 10.1016/j.csbj.2019.12.011. eCollection 2020.

Deep learning methods in protein structure prediction

Mirko Torrisi et al. Comput Struct Biotechnol J.

Abstract

Protein Structure Prediction is a central topic in Structural Bioinformatics. Since the 1960s, statistical methods, followed by increasingly complex Machine Learning and, more recently, Deep Learning methods, have been employed to predict protein structural information at various levels of detail. In this review, we briefly introduce the problem of protein structure prediction and the essential elements of Deep Learning (such as Convolutional Neural Networks, Recurrent Neural Networks and the basic feed-forward Neural Networks they are founded on). We then discuss the evolution of predictive methods for one-dimensional and two-dimensional Protein Structure Annotations, from the simple statistical methods of the early days to the computationally intensive, highly sophisticated Deep Learning algorithms of the last decade. In the process, we review the growth of the databases these algorithms are based on, and how this growth has affected our ability to leverage knowledge about evolution and co-evolution to achieve improved predictions. We conclude by outlining the current role of Deep Learning techniques within the wider pipelines for predicting protein structures, and by trying to anticipate what challenges and opportunities may arise next.

Keywords: Deep learning; Machine learning; Protein structure prediction.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1. A generic pipeline for ab initio Protein Structure Prediction, in which evolutionary information in the form of alignments and 1D and 2D Protein Structure Annotations (PSA) are intermediate steps.
Fig. 2. Growth of known structures in the Protein Data Bank (left) and known sequences in UniProt (right). The y-axis of the UniProt plot is shown on a logarithmic scale.
Fig. 3. Performance of secondary structure predictors over the years. “stat” are predictors based on statistical methods other than Neural Networks. “ML” are predictors based on shallow Neural Networks or Support Vector Machines. “DL-CNN” are Deep Learning methods based on Convolutional Neural Networks. “DL-RNN” are Deep Learning methods based on Recurrent Neural Networks. Data were extracted from the publications accompanying the predictors referenced in this article.
Fig. 4. Improvements in the quality of 3D predictions for free modelling (ab initio) targets between CASP9 and CASP13.
