IET Syst Biol. 2025 Jan-Dec;19(1):e70024.
doi: 10.1049/syb2.70024.

Proteins Combined Score Prediction Based on Improved Gene Expression Programming Algorithm and Protein-Protein Interaction Network Characterization

Sicong Huo et al. IET Syst Biol. 2025 Jan-Dec.

Abstract

Predicting the combined score in protein-protein interaction (PPI) networks represents a critical research focus in bioinformatics, as it contributes to enhancing the accuracy of PPI data and uncovering the inherent complexity of biological systems. However, existing intelligent algorithms encounter significant challenges in effectively integrating heterogeneous data sources, capturing the nonlinear dependencies within PPI networks, and improving model generalizability. To address these limitations, this study introduces an enhanced gene expression programming (DF-GEP) algorithm that incorporates dynamic factor optimization. The proposed DF-GEP framework integrates Spearman correlation analysis with kernel ridge regression (SC-KRR) to extract and assign refined weights to key PPI network features. Additionally, the algorithm adaptively regulates selection, crossover, mutation and fitness evaluation processes via dynamic factor adjustment, thereby improving adaptability and predictive precision. Experimental results show that the DF-GEP algorithm consistently outperforms baseline models in both predictive accuracy and stability. Beyond its application to PPI-combined score prediction, the proposed algorithm also exhibits strong potential for addressing complex nonlinear problems in other domains.

Keywords: biology computing; data mining; genetic algorithms; principal component analysis.
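
The abstract states that SC-KRR uses Spearman correlation to weight PPI network features, but it does not give the procedure. The sketch below is only an illustration of the correlation step under assumed conventions: each feature's weight is taken as its absolute Spearman coefficient against the combined score, normalized so the weights sum to 1. The kernel ridge regression step is omitted, and the function names, feature names, and toy values are hypothetical rather than taken from the paper.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def feature_weights(features, target):
    """Assumed weighting rule: |rho| per feature, normalized to sum to 1."""
    rhos = {name: spearman(col, target) for name, col in features.items()}
    total = sum(abs(r) for r in rhos.values())
    if total == 0:
        # No feature is rank-correlated with the target: fall back to equal weights.
        return {name: 1 / len(rhos) for name in rhos}
    return {name: abs(r) / total for name, r in rhos.items()}


# Toy data (hypothetical): two network properties and a combined score per edge.
toy_features = {
    "degree": [3, 5, 2, 8, 6],
    "betweenness": [0.1, 0.4, 0.05, 0.2, 0.9],
}
toy_scores = [400, 700, 300, 950, 800]
weights = feature_weights(toy_features, toy_scores)
print(weights)
```

Rank-based correlation is a natural choice when the relation between a network property and the combined score may be monotonic but not linear, which matches the abstract's emphasis on nonlinear dependencies.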

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

FIGURE 1
Model design flow.
FIGURE 2
Multiplication matrix model design.
FIGURE 3
SC‐KRR algorithm flow.
FIGURE 4
GEP algorithm construction.
FIGURE 5
DF‐GEP algorithm flow.
FIGURE 6
Initial PPI network maps of ALK, TP53, and PTEN proteins. (a) ALK PPI network; (b) TP53 PPI network; (c) PTEN PPI network.
FIGURE 7
Cytoscape analysis of ALK, TP53, and PTEN PPI networks. (a) Analysing PPI network maps of ALK proteins using Cytoscape software; (b) Analysing PPI network maps of TP53 proteins using Cytoscape software; (c) Analysing PPI network maps of PTEN proteins using Cytoscape software.
FIGURE 8
Experimental design process.
FIGURE 9
Characterization of ALK, TP53 and PTEN protein PPI networks and combined score association analysis. (a) Correlation analysis of ALK protein PPI network properties with combined score; (b) Correlation analysis of TP53 protein PPI network properties with combined score; (c) Correlation analysis of PTEN protein PPI network properties with combined score; (d) ALK, TP53, PTEN proteins PPI network properties weight distribution.
FIGURE 10
Comparison of performance and generalization ability of different models on ALK, TP53 and PTEN protein datasets. (a) Comparison of the generalization ability of different models for the ALK protein dataset; (b) Comparison of the performance of different models for the TP53 protein dataset; (c) Comparison of the performance of different models for the PTEN protein dataset.
FIGURE 11
Comparison of different algorithms for successful classification in ALK, TP53 and PTEN protein classification tasks. (a) ALK protein successful classification comparison across algorithms; (b) TP53 protein successful classification comparison across algorithms; (c) PTEN protein successful classification comparison across algorithms.
FIGURE 12
Prediction error of each algorithm and its 95% confidence interval (Bootstrap, n = 400).
FIGURE 13
ROC curves and AUC comparison of different models on binary classification (threshold = 700).
FIGURE 14
Performance comparison of models based on AUC, precision, recall and F1 score.
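
The captions of Figures 12 and 13 mention a bootstrap 95% confidence interval (n = 400) and a binary classification at threshold = 700. The page does not say how these were computed, so the following is a minimal sketch under assumptions: n = 400 is read as the number of bootstrap resamples, the error metric is taken to be mean absolute error, the interval is a percentile interval, and an interaction counts as positive when its true combined score is at least 700. All names and toy values are hypothetical.

```python
import random


def bootstrap_mae_ci(y_true, y_pred, n_resamples=400, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean absolute error (assumed metric)."""
    rng = random.Random(seed)
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(errors)
    # Resample the per-item errors with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choice(errors) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(errors) / n, lo, hi


def binarize(scores, threshold=700):
    """Label a score as positive (1) when it is at or above the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true and predicted combined scores for four interactions.
true_scores = [900, 650, 720, 400]
pred_scores = [850, 600, 690, 710]

mae, ci_low, ci_high = bootstrap_mae_ci(true_scores, pred_scores)
labels = binarize(true_scores)
print(mae, ci_low, ci_high, auc(labels, pred_scores))
```

Using the predicted score directly as the ranking variable means the AUC does not depend on choosing a second cut-off for the predictions; the threshold is applied only to the true scores to define the classes.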
