CPE-Pro: A Structure-Sensitive Deep Learning Method for Protein Representation and Origin Evaluation
- PMID: 40483648
- DOI: 10.1007/s12539-025-00732-4
Abstract
Protein structures are fundamental to understanding protein functions and interactions. With the continuous advancement of protein structure prediction methods, structure databases are rapidly expanding. Identifying the origin of protein structures is crucial for assessing the reliability of experimental resolution and computational prediction methods, as well as for guiding downstream biological research. Existing protein representation approaches often fail to capture subtle yet critical structural differences, posing challenges for precise structural traceability. To address this, we propose a structure-sensitive supervised deep learning model, Crystal vs Predicted Evaluator for Protein Structure (CPE-Pro), for the representation and origin evaluation of protein structures. CPE-Pro integrates a pre-trained protein Structural Sequence Language Model (SSLM) and a Geometric Vector Perceptron-Graph Neural Network (GVP-GNN) to learn structure-aware protein representations and capture structural differences, enabling accurate classification across four origins of structural data. Preliminary results indicate that, compared to large-scale protein language models trained on extensive amino acid sequences, structural sequences enriched with local structural features enable the model to capture more informative protein characteristics, thereby enhancing and refining protein representations. Future research directions include extending the architecture to additional protein structure paradigms and developing evaluation methodologies for low-pLDDT predicted structures, providing more effective tools for protein structure analysis. The code, model weights, and all relevant materials are available at https://github.com/wr1102/CPE-Pro.
Keywords: Deep learning; Origin evaluation; Protein representation; Structural sequence.
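The abstract describes combining a sequence-style representation (from the SSLM) with a graph-style representation (from the GVP-GNN) and classifying a structure into one of four origins. The following is a minimal sketch of that general idea only, not the authors' implementation: the fusion by concatenation, the single linear layer, the embedding sizes, and all names (`classify_origin`, `seq_embedding`, `graph_embedding`) are assumptions made for illustration, and the toy vectors stand in for real encoder outputs. The released code at the repository linked above is the authoritative reference.

```python
import math
import random

NUM_ORIGINS = 4  # the abstract states four origins; their names are not given there


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_origin(seq_embedding, graph_embedding, weights, bias):
    """Concatenate two embeddings, apply one linear layer, return class probabilities.

    Concatenation plus a linear head is an assumed fusion scheme for illustration.
    """
    fused = list(seq_embedding) + list(graph_embedding)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy inputs standing in for the outputs of the two encoders (hypothetical sizes).
rng = random.Random(0)
seq_embedding = [rng.uniform(-1, 1) for _ in range(8)]
graph_embedding = [rng.uniform(-1, 1) for _ in range(6)]
weights = [[rng.uniform(-0.5, 0.5) for _ in range(14)] for _ in range(NUM_ORIGINS)]
bias = [0.0] * NUM_ORIGINS

probs = classify_origin(seq_embedding, graph_embedding, weights, bias)
predicted = max(range(NUM_ORIGINS), key=lambda i: probs[i])
```

Here `probs` holds one probability per origin class and `predicted` is the index of the most likely class; in a trained model the weights would be learned from labeled structures rather than drawn at random.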
© 2025. International Association of Scientists in the Interdisciplinary Areas.
Conflict of interest statement
Declarations. Conflict of interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.