Review
Comput Struct Biotechnol J. 2022 May 27:20:2699-2712.
doi: 10.1016/j.csbj.2022.05.049. eCollection 2022.

Computational identification of protein complexes from network interactions: Present state, challenges, and the way forward


Sara Omranian et al. Comput Struct Biotechnol J. 2022.

Abstract

Physically interacting proteins form macromolecular complexes that drive diverse cellular processes. Advances in experimental techniques that capture interactions between proteins have yielded protein-protein interaction (PPI) networks for several model organisms. These datasets have enabled the prediction and other computational analyses of protein complexes. Here we provide a systematic review of the state-of-the-art algorithms for protein complex prediction from PPI networks proposed in the past two decades. We categorize the existing approaches into three groups: cluster quality-based, node affinity-based, and network embedding-based approaches, and we compare their advantages and disadvantages. We further include a comparative analysis of eighteen methods, evaluated with twelve well-established performance measures on four widely used benchmark PPI networks. Finally, we discuss the limitations of both current data and current approaches, along with potential solutions, with emphasis on the points that pave the way for future research efforts in this field.

Keywords: Network Clustering Algorithms; Network embedding; Protein Complex Prediction; Protein-Protein interaction network.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1
Categories of the network clustering algorithms used for protein complex prediction from PPI networks. The network clustering algorithms require as input either only a PPI network (methods in black) or both a PPI network and biological information (methods in red). Regardless of the input, the existing network clustering algorithms with applications to complex prediction can be divided into three categories, namely: node affinity-based, cluster quality-based, and network embedding-based methods. For each category, several examples are given and explained in this review. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 2
Categories of computational approaches to detect protein complexes. Node affinity-based approaches use different node scoring methods, while cluster quality-based approaches cast protein complex prediction as an optimization problem on PPI networks. However, the subsequent steps to find protein complexes are almost the same for both categories. The network embedding-based approaches first transform each node into a vector and then compute similarities between pairs of node vectors. Lastly, they apply any network clustering algorithm to find protein complexes.
Fig. 3
GO semantic similarity analysis of protein complexes in the gold standards. The distribution of the median GO semantic similarity of reference complexes is compared with that of randomly generated complexes, using five gold standards in total for three species: (A) E. coli, (B) S. cerevisiae, and (C) H. sapiens.
Fig. 4
Comparative analysis of approaches for prediction of protein complexes. Eighteen state-of-the-art approaches are applied to four PPI networks of S. cerevisiae: (A) Collins, (B) Gavin, (C) KroganCore, and (D) KroganExt. The predicted clusters from the different approaches are compared with the protein complexes in the gold standard CYC2008. The comparison uses a composite score, which is the sum of four performance measures: maximum matching ratio (MMR), fraction match (FRM), accuracy (ACC), and F-measure. The eighteen approaches are ordered first by category: node affinity-based (in brown), cluster quality-based (in green), and network embedding-based (in pink). Within each category, the methods are ordered by year of publication. The results indicate that the cluster quality-based methods, more specifically those that model a protein complex as a biclique spanned subgraph, outperformed the others. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
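
The composite score in the Fig. 4 caption is a plain sum of four measures, so ranking methods by it can be sketched in a few lines. The sketch below assumes the four measures have already been computed for each method; the names `Measures`, `composite_score`, and `rank_methods` are hypothetical, and the numbers are made-up placeholders, not results from the review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measures:
    """Four precomputed performance measures for one method on one network."""
    mmr: float        # maximum matching ratio
    frm: float        # fraction match
    acc: float        # accuracy
    f_measure: float  # F-measure


def composite_score(m: Measures) -> float:
    """Return the composite score: the unweighted sum of the four measures."""
    return m.mmr + m.frm + m.acc + m.f_measure


def rank_methods(results: dict[str, Measures]) -> list[tuple[str, float]]:
    """Sort methods by composite score, best first."""
    scored = [(name, composite_score(m)) for name, m in results.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder values for two hypothetical methods (assumption, not real data).
example = {
    "method_a": Measures(mmr=0.25, frm=0.5, acc=0.5, f_measure=0.25),
    "method_b": Measures(mmr=0.5, frm=0.5, acc=0.75, f_measure=0.5),
}
ranking = rank_methods(example)
```

Because the sum is unweighted, each measure contributes equally, and a method that is weak on one measure can still rank highly if it does well on the other three.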
