Front Genet. 2021 Mar 17:12:645932.
doi: 10.3389/fgene.2021.645932. eCollection 2021.

Method for Essential Protein Prediction Based on a Novel Weighted Protein-Domain Interaction Network

Zixuan Meng et al. Front Genet. 2021.

Abstract

In recent years, a number of computational models based on protein-protein interaction (PPI) networks have been proposed for inferring key proteins. However, because PPI networks contain false positives and false negatives and are incomplete, designing computational models with satisfactory predictive accuracy remains challenging. This study proposes a prediction model called WPDINM for detecting key proteins based on a novel weighted protein-domain interaction (PDI) network. In WPDINM, a weighted PPI network is first constructed by combining the gene expression data of proteins with topological information extracted from the original PPI network. In parallel, a weighted domain-domain interaction (DDI) network is constructed based on the original PDI network. Next, the newly obtained weighted PPI network and weighted DDI network are integrated with the original PDI network to construct a weighted PDI network. Then, based on topological features and biological information, including the subcellular localization and orthologous information of proteins, a novel PageRank-based iterative algorithm is designed and run on the newly constructed weighted PDI network to estimate the criticality of proteins. Finally, to assess the prediction performance of WPDINM, we compared it with 12 competing measures. Experimental results show that WPDINM achieves predictive accuracies of 90.19, 81.96, 70.72, 62.04, 55.83, and 51.13% among the top 1%, 5%, 10%, 15%, 20%, and 25% of ranked proteins, respectively, which exceeds the prediction accuracy achieved by traditional state-of-the-art competing measures. Given this identification performance, the WPDINM measure may contribute to the further development of key protein identification.
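The PageRank-based scoring step described in the abstract can be illustrated with a minimal sketch. This is a generic weighted PageRank with a restart vector, not the authors' exact update rule, which the abstract does not give. The function name `pagerank_scores`, the damping value `alpha`, the toy edge weights, and the uniform prior (standing in for the initial scores that WPDINM derives from subcellular localization and orthology) are all assumptions made for illustration.

```python
def pagerank_scores(weights, prior, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate s = alpha * P^T s + (1 - alpha) * p until the scores stabilise.

    weights: dict node -> dict of neighbour -> non-negative edge weight
    prior:   dict node -> non-negative initial score (normalised to p below)
    Every neighbour must itself be a key of `weights`.
    """
    nodes = sorted(weights)
    total = sum(prior[n] for n in nodes)
    p = {n: prior[n] / total for n in nodes}  # restart distribution
    out_sum = {n: sum(weights[n].values()) for n in nodes}
    scores = dict(p)
    for _ in range(max_iter):
        # Each node keeps a (1 - alpha) share of its prior ...
        new = {n: (1 - alpha) * p[n] for n in nodes}
        # ... and receives score from neighbours in proportion to edge weight.
        for u in nodes:
            if out_sum[u] == 0:
                continue  # a node with no edges passes nothing on
            for v, w in weights[u].items():
                new[v] += alpha * scores[u] * w / out_sum[u]
        delta = sum(abs(new[n] - scores[n]) for n in nodes)
        scores = new
        if delta < tol:
            break
    return scores


# Hypothetical toy network: three proteins (P1..P3) and one domain (D1).
toy_weights = {
    "P1": {"P2": 0.9, "P3": 0.4},
    "P2": {"P1": 0.9, "P3": 0.7},
    "P3": {"P1": 0.4, "P2": 0.7, "D1": 0.2},
    "D1": {"P3": 0.2},
}
toy_prior = {"P1": 1.0, "P2": 1.0, "P3": 1.0, "D1": 1.0}

scores = pagerank_scores(toy_weights, toy_prior)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this sketch the final scores sum to 1, and sorting nodes by score gives the ranked list from which the top 1% to top 25% of candidates would be taken.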

Keywords: computational model; domain-domain interaction network; essential proteins; protein-domain interaction network; protein-protein interaction network.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The handling editor QZ declared a past co-authorship/collaboration with one of the authors LW.

Figures

FIGURE 1
Flowchart of WPDINM.
FIGURE 2
Comparison of the number of essential proteins predicted by WPDINM and 12 other methods (SC, BC, CC, DC, IC, EC, NC, PeC, CoEWC, POEM, ION, TEGS) based on the DIP database. (A) Top 1% ranked proteins, (B) Top 5% ranked proteins, (C) Top 10% ranked proteins, (D) Top 15% ranked proteins, (E) Top 20% ranked proteins, (F) Top 25% ranked proteins.
FIGURE 3
Comparison of the number of essential proteins predicted by WPDINM and 12 other methods (SC, BC, CC, DC, IC, EC, NC, PeC, CoEWC, POEM, ION, TEGS) based on the Krogan database. (A) Top 1% ranked proteins, (B) Top 5% ranked proteins, (C) Top 10% ranked proteins, (D) Top 15% ranked proteins, (E) Top 20% ranked proteins, (F) Top 25% ranked proteins.
FIGURE 4
Comparison of Jackknife curves of WPDINM and 12 other methods based on the DIP dataset. (A) Jackknife curves of WPDINM and DC, IC, EC, BC, CC, and NC. (B) Jackknife curves of WPDINM and SC, PeC, CoEWC, POEM, ION, and TEGS.
FIGURE 5
Comparison of Jackknife curves of WPDINM and 12 other methods based on the Krogan database. (A) Jackknife curves of WPDINM and DC, IC, EC, BC, CC, and NC. (B) Jackknife curves of WPDINM and SC, PeC, CoEWC, POEM, ION, and TEGS.
FIGURE 6
Comparison of PR curves and ROC curves between WPDINM and the other competing methods based on the DIP database. (A) The PR curves of DC, BC, SC, NC. (B) The ROC curves of DC, BC, SC, NC. (C) The PR curves of EC, IC, CC, PeC. (D) The ROC curves of EC, IC, CC, PeC. (E) The PR curves of CoEWC, POEM, ION, TEGS. (F) The ROC curves of CoEWC, POEM, ION, TEGS.
FIGURE 7
Comparison of PR curves and ROC curves between WPDINM and the other competing methods based on the Krogan database. (A) The PR curves of DC, CC, EC, NC. (B) The ROC curves of DC, CC, EC, NC. (C) The PR curves of IC, SC, PeC, BC. (D) The ROC curves of IC, SC, PeC, BC. (E) The PR curves of CoEWC, POEM, ION, TEGS. (F) The ROC curves of CoEWC, POEM, ION, TEGS.
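
The top-percent comparisons and Jackknife curves in the figures above both start from a ranked protein list and a set of known essential proteins. The sketch below shows one plausible way to compute them. The function names, the toy data, and the choice to round the cutoff up with `math.ceil` are assumptions for illustration; the page does not state how the authors round the cutoff.

```python
import math


def top_percent_accuracy(ranked, essential, percent):
    """Fraction of the top `percent`% of ranked proteins that are essential."""
    k = math.ceil(len(ranked) * percent / 100)  # assumed rounding rule
    hits = sum(1 for protein in ranked[:k] if protein in essential)
    return hits / k


def jackknife_curve(ranked, essential):
    """Cumulative count of essential proteins among the top 1, 2, ..., n."""
    curve, hits = [], 0
    for protein in ranked:
        hits += protein in essential  # True counts as 1
        curve.append(hits)
    return curve


# Hypothetical toy data: eight ranked proteins, four of them essential.
ranked = ["A", "B", "C", "D", "E", "F", "G", "H"]
essential = {"A", "C", "D", "H"}

acc_25 = top_percent_accuracy(ranked, essential, 25)
acc_50 = top_percent_accuracy(ranked, essential, 50)
curve = jackknife_curve(ranked, essential)
```

A method whose curve rises faster places more essential proteins near the top of its ranking, which is the sense in which such curves are compared across methods.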
