Int J Mol Sci. 2021 Nov 27;22(23):12835.
doi: 10.3390/ijms222312835.

Evaluation of Deep Neural Network ProSPr for Accurate Protein Distance Predictions on CASP14 Targets

Jacob Stern et al. Int J Mol Sci. 2021.

Abstract

The field of protein structure prediction has recently been revolutionized through the introduction of deep learning. The current state-of-the-art tool AlphaFold2 can predict highly accurate structures; however, it has a prohibitively long inference time for applications that require the folding of hundreds of sequences. The prediction of protein structure annotations, such as amino acid distances, can be achieved at a higher speed with existing tools, such as the ProSPr network. Here, we report on important updates to the ProSPr network, its performance in the recent Critical Assessment of Techniques for Protein Structure Prediction (CASP14) competition, and an evaluation of its accuracy dependency on sequence length and multiple sequence alignment depth. We also provide a detailed description of the architecture and the training process, accompanied by reusable code. This work is anticipated to provide a solid foundation for the further development of protein distance prediction tools.

Keywords: CASP; ProSPr; alphafold; contact; dataset; deep learning; distance; prediction; protein; retrainable.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Two example targets from the CASP14 test set. Left: experimental structures from which labels were derived. Middle: contact maps, with the ProSPr ensemble prediction above the diagonal and the label below it. Right: auxiliary loss predictions (top) with their labels (bottom) for accessible surface area (ASA), torsion angles (PHI, PSI), and secondary structure (SS).
Figure 2
Left: correlation analysis of average accuracy (see text for definition) for CASP14 targets with an MSA of fewer than 400 sequences. Middle: correlation analysis for targets with an MSA deeper than 400 sequences. Right: correlation analysis of average accuracy and target amino acid sequence length.
Figure 3
ProSPr network architecture and model architecture.
Figure 4
Detailed view of the ProSPr data pipeline. For training, a protein structure in the PDB file format is used to create inputs and labels. For inference, a multiple sequence alignment in the a3m file format is expected.
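
The captions for Figures 2 and 4 refer to a3m alignments and to an MSA depth cutoff of 400 sequences. As a rough illustration only, the sketch below reads a3m text and counts its sequences. It is not the ProSPr code: the function names (`read_a3m`, `msa_depth`, `is_deep`) are hypothetical, and it assumes that depth means the number of sequences in the file, query included, and that lowercase letters mark insertions, as is conventional for a3m.

```python
import string

# Lowercase letters in a3m mark insertions relative to the query;
# deleting them leaves every sequence at the query's aligned length.
_DROP_LOWERCASE = str.maketrans("", "", string.ascii_lowercase)


def read_a3m(text):
    """Parse a3m text into a list of (header, aligned_sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                sequence = "".join(chunks).translate(_DROP_LOWERCASE)
                records.append((header, sequence))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data; anything before the first header is ignored.
            chunks.append(line)
    if header is not None:
        sequence = "".join(chunks).translate(_DROP_LOWERCASE)
        records.append((header, sequence))
    return records


def msa_depth(records):
    """Assumed definition: number of sequences, including the query."""
    return len(records)


def is_deep(records, threshold=400):
    """True if the alignment has more sequences than the threshold.

    The default of 400 mirrors the split in the Figure 2 caption.
    """
    return msa_depth(records) > threshold


if __name__ == "__main__":
    example = ">query\nMKV-LA\n>hit1\nMKvvV-LA\n>hit2\nM-V-LA\n"
    records = read_a3m(example)
    print(msa_depth(records), is_deep(records))
```

With an alignment loaded this way, targets could be grouped by `is_deep` before comparing their accuracy, which is one plausible way to reproduce a split like the one in Figure 2.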
