Int J Mol Sci. 2023 Jun 28;24(13):10761.
doi: 10.3390/ijms241310761.

Predicting Structural Susceptibility of Proteins to Proteolytic Processing

Evgenii V Matveev et al. Int J Mol Sci.

Abstract

The importance of 3D protein structure in proteolytic processing is well known. However, despite the plethora of existing methods for predicting proteolytic sites, only a few of them utilize the structural features of potential substrates as predictors. Moreover, to our knowledge, there is currently no method available for predicting the structural susceptibility of protein regions to proteolysis. We developed such a method using data from CutDB, a database that contains experimentally verified proteolytic events. For prediction, we utilized structural features that have been shown to influence proteolysis in earlier studies, such as solvent accessibility, secondary structure, and temperature factor. Additionally, we introduced new structural features, including length of protruded loops and flexibility of protein termini. To maximize the prediction quality of the method, we carefully curated the training set, selected an appropriate machine learning method, and sampled negative examples to determine the optimal positive-to-negative class size ratio. We demonstrated that combining our method with models of protease primary specificity can outperform existing bioinformatics methods for the prediction of proteolytic sites. We also discussed the possibility of utilizing this method for bioinformatics prediction of other post-translational modifications.
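The negative-sampling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that sites are represented as hypothetical `(protein_id, position)` tuples and that negatives are drawn uniformly at random from positions with no recorded cleavage; the abstract does not describe the authors' actual sampling procedure, so the function name, parameters, and toy data below are illustrative only.

```python
import random


def sample_negatives(positives, candidates, neg_per_pos=1.0, seed=0):
    """Draw negative examples from candidate sites that are not known positives.

    positives   -- list of (protein_id, position) tuples with a verified cleavage
    candidates  -- list of all (protein_id, position) tuples that could be cut
    neg_per_pos -- number of negatives to draw per positive (1.0 gives a 1:1 ratio)
    """
    positive_set = set(positives)
    # Exclude verified cleavage sites from the pool of possible negatives.
    pool = [site for site in candidates if site not in positive_set]
    # Never request more negatives than the pool can supply.
    n_neg = min(len(pool), round(len(positives) * neg_per_pos))
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return rng.sample(pool, n_neg)


# Toy data (hypothetical): three verified sites in two proteins.
positives = [("P1", 15), ("P1", 88), ("P2", 40)]
candidates = [("P1", i) for i in range(1, 101)] + [("P2", i) for i in range(1, 61)]

negatives = sample_negatives(positives, candidates, neg_per_pos=2.0)
print(len(positives), len(negatives))
```

Repeating the call with different `neg_per_pos` values and retraining each time is one way to scan class ratios of the kind the abstract refers to.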

Keywords: protease substrates; proteases; regulatory proteolysis; substrate identification.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
(A) List of structural features used in the method, along with examples of their distribution along the protein polypeptide chain visualized within the substrate structures. The color bar represents a color scale ranging from 0 to 1, indicating numerical features such as solvent accessibility, temperature factor, and loop length, as well as binary features such as terminal regions. The secondary structure is visualized using a different color scheme: helices are shown in green, beta strands in light blue, and loops in yellow. (B) Prediction quality, measured using the Area Under the ROC Curve (AUC), of various machine learning methods calculated via cross-validation using the training set of CutDB proteolytic events mapped onto PDB structures. Negative class examples were sampled to achieve a 1:1 positive-to-negative class size ratio. (C) Dependence of the method’s prediction quality on different positive-to-negative class ratios. (D) Visualization of the proteolytic susceptibility probabilities predicted by our method for the 3D structure of the protease substrate.
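For readers unfamiliar with the metric in panel (B), the AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition; the labels and scores are made-up values, not data from the paper, and `roc_auc` is an illustrative helper rather than the authors' code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as a pairwise ranking probability.

    labels -- iterable of 1 (cleaved site) or 0 (non-cleaved site)
    scores -- predicted susceptibility for each site, same order as labels
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up example: three positives and three negatives.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.5]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In a cross-validation setting, the same function would be applied to the held-out fold of each split and the results averaged.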
Figure 2
(A) Improvement in prediction quality of the method after extension of the training set using AlphaFold models. (B) Comparison of AlphaFold confidence score with solvent accessibility and loop length in predicting proteolytic sites.
Figure 3
(A) A schematic representation of combining the proteolytic susceptibility probabilities predicted by our 3D structure-based method with protease sequence specificity models for comparison with other proteolytic site prediction methods. (B) Comparison of prediction quality between our method combined with protease sequence specificity models and the Procleave method.
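The combination shown schematically in panel (A) can be sketched as follows. The caption does not state the combination rule, so this example assumes, purely for illustration, that both scores lie in [0, 1] and are multiplied per residue; `combine_scores` and the numbers are hypothetical.

```python
def combine_scores(structural_prob, sequence_score):
    """Combine per-residue structural susceptibility with sequence specificity.

    Assumption: both inputs are scaled to [0, 1] and are combined by a simple
    product, so a site scores highly only if it is both structurally exposed
    and matches the protease's sequence preference.
    """
    if len(structural_prob) != len(sequence_score):
        raise ValueError("score lists must have the same length")
    return [p * s for p, s in zip(structural_prob, sequence_score)]


# Hypothetical scores for four candidate positions in one substrate.
structural_prob = [0.10, 0.80, 0.60, 0.05]   # from a structure-based model
sequence_score = [0.90, 0.70, 0.20, 0.95]    # from a protease specificity model

combined = combine_scores(structural_prob, sequence_score)
best = max(range(len(combined)), key=combined.__getitem__)
print(best, round(combined[best], 2))
```

A product penalizes sites that score well on only one criterion; other rules, such as a weighted sum or a second-stage classifier, would be equally plausible readings of the schematic.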
