iScore: a novel graph kernel-based function for scoring protein-protein docking models

Cunliang Geng¹, Yong Jung²,³,⁴, Nicolas Renaud⁵, Vasant Honavar²,³,⁴,⁶,⁷,⁸,⁹, Alexandre M J J Bonvin¹, Li C Xue¹

Affiliations

¹ Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584 CH, The Netherlands.
² Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA.
³ Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16823, USA.
⁴ Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA.
⁵ Netherlands eScience Center, Amsterdam 1098 XG, The Netherlands.
⁶ Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA 16823, USA.
⁷ Institute for Cyberscience, University Park, PA 16802, USA.
⁸ Clinical and Translational Sciences Institute, University Park, PA 16802, USA.
⁹ College of Information Sciences & Technology, Pennsylvania State University, University Park, PA 16802, USA.

PMID: 31199455
PMCID: PMC6956772
DOI: 10.1093/bioinformatics/btz496

Cunliang Geng et al. Bioinformatics. 2020 Jan 1;36(1):112-121. doi: 10.1093/bioinformatics/btz496.

Abstract

Motivation: Protein complexes play critical roles in many biological functions. Three-dimensional (3D) structures of protein complexes are essential for gaining insight into the structural basis of interactions and their roles in the biomolecular pathways that orchestrate key cellular processes. Because of the expense and effort associated with experimental determination of 3D protein complex structures, computational docking has evolved into a valuable tool for predicting 3D structures of biomolecular complexes. Despite recent progress, reliably distinguishing near-native docking conformations from a large number of candidate conformations, the so-called scoring problem, remains a major challenge.

Results: Here we present iScore, a novel approach to scoring docked conformations that combines HADDOCK energy terms with a score obtained using a graph representation of the protein-protein interfaces and a measure of evolutionary conservation. It achieves a scoring performance competitive with, or superior to, that of state-of-the-art scoring functions on two independent datasets: (i) docking software-specific models and (ii) the CAPRI score set generated by a wide variety of docking approaches (i.e. docking software-non-specific). iScore ranks among the top scoring approaches on the CAPRI score set (13 targets) when compared with the 37 scoring groups in CAPRI. The results demonstrate the utility of combining evolutionary, topological and energetic information for scoring docked conformations. This work represents the first successful application of graph kernels to protein interfaces for effective discrimination of near-native and non-native conformations of protein complexes.

Availability and implementation: The iScore code is freely available from GitHub: https://github.com/DeepRank/iScore (DOI: 10.5281/zenodo.2630567). The docking models used are available from SBGrid: https://data.sbgrid.org/dataset/684.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

**Fig. 1.**
Schematic workflow of our graph kernel-based scoring method. Docking models for a protein–protein complex are first represented as graphs by treating the interface residues as graph nodes and the intermolecular contacts they form as graph edges. Interface features are added to the graph as node or edge labels (only PSSM profiles as node labels in this case). Then, the interface graph of each docking model is compared to the interface graphs of both the positive (native) structures and the negative (non-native) models. This graph comparison generates a similarity matrix whose rows correspond to the docking models and whose columns correspond to the positive and negative graphs combined. Next, the support vector machine takes this graph kernel matrix as input and predicts decision values that are used as the GraphRank score. The final scoring function iScore is a linear combination of the GraphRank score and HADDOCK energetic terms (van der Waals, electrostatic and desolvation energies). The weights of this linear combination are optimized using a genetic algorithm (GA) over the BM4 HADDOCK dataset.
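
The workflow in Fig. 1 can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the iScore code base: the 6 Å contact cutoff, the one-letter node labels standing in for PSSM profiles, the simple edge-label histogram kernel standing in for the paper's graph kernel, and the weights in the final call. The support vector machine step is not reproduced; a placeholder value plays the role of the GraphRank score.

```python
from collections import Counter
from itertools import product
from math import dist


def interface_graph(residues, cutoff=6.0):
    """Build (nodes, edges) from residues given as (chain, res_id, label, xyz).

    The cutoff (in angstroms) is a hypothetical value chosen for this sketch.
    """
    edges = set()
    for a, b in product(residues, repeat=2):
        # a[0] < b[0] keeps only intermolecular pairs and counts each pair once.
        if a[0] < b[0] and dist(a[3], b[3]) <= cutoff:
            edges.add(((a[0], a[1]), (b[0], b[1])))
    in_contact = {key for edge in edges for key in edge}
    nodes = {(r[0], r[1]): r[2] for r in residues if (r[0], r[1]) in in_contact}
    return nodes, edges


def edge_label_histogram(graph):
    """Count edges by the unordered pair of labels at their endpoints."""
    nodes, edges = graph
    return Counter(tuple(sorted((nodes[u], nodes[v]))) for u, v in edges)


def toy_kernel(g1, g2):
    """Linear kernel on edge-label histograms (a stand-in, not the paper's kernel)."""
    h1, h2 = edge_label_histogram(g1), edge_label_histogram(g2)
    return sum(count * h2[label] for label, count in h1.items())


def kernel_matrix(models, references):
    """Rows: docking models. Columns: positive and negative reference graphs."""
    return [[toy_kernel(m, r) for r in references] for m in models]


def combined_score(graph_rank, e_vdw, e_elec, e_desolv, weights):
    """Linear combination of the GraphRank score and three energy terms."""
    w_gr, w_vdw, w_elec, w_desolv = weights
    return w_gr * graph_rank + w_vdw * e_vdw + w_elec * e_elec + w_desolv * e_desolv


# Toy data, invented for this example.
model = interface_graph([
    ("A", 1, "H", (0.0, 0.0, 0.0)),
    ("A", 2, "P", (20.0, 0.0, 0.0)),  # too far from chain B to be at the interface
    ("B", 1, "H", (3.0, 0.0, 0.0)),
    ("B", 2, "P", (0.0, 4.0, 0.0)),
])
positive = model
negative = interface_graph([
    ("A", 1, "P", (0.0, 0.0, 0.0)),
    ("B", 1, "P", (3.0, 0.0, 0.0)),
])
K = kernel_matrix([model], [positive, negative])

# Placeholder GraphRank value and hypothetical weights; the paper fits the
# weights with a genetic algorithm, which is not shown here.
score = combined_score(1.0, -2.0, -1.0, 0.5, weights=(1.0, 0.5, 0.2, 0.1))
```

In this toy case `K` has one row (one docking model) and two columns (one positive and one negative reference graph), matching the matrix layout described in the caption. In a fuller version a kernel SVM would be trained on the reference graphs and its decision values would replace the placeholder passed to `combined_score`.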

**Fig. 2.**
Success rate of HADDOCK score, GraphRank and iScore on the BM4 HADDOCK training dataset over top N clusters of models.

**Fig. 3.**
Success rates measured at cluster level on four sets of docking program-specific models for newly added protein–protein complexes in BM5. GraphRank and iScore are compared with scoring functions from HADDOCK (A), SwarmDock (B), pyDock (C) and ZDock (D) on the docking models of the corresponding docking program, respectively.
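
The captions of Figs 2 and 3 report success rates over the top N clusters without defining the metric. A minimal sketch follows, assuming that a case counts as a success when at least one of its top N ranked clusters is of acceptable quality; this definition, the function name and the toy data are assumptions for illustration only.

```python
def success_rate(ranked_cluster_quality, top_n):
    """Fraction of cases with at least one acceptable cluster in the top_n.

    ranked_cluster_quality: one list per case, ordered from best-scored to
    worst-scored cluster, where True marks a cluster of acceptable quality.
    """
    hits = sum(any(case[:top_n]) for case in ranked_cluster_quality)
    return hits / len(ranked_cluster_quality)


# Toy rankings for three cases, invented for this example.
cases = [
    [False, True, False],
    [True, False],
    [False, False, False, True],
]
rates = {n: success_rate(cases, n) for n in (1, 2, 4)}
```

Plotting such rates against N for each scoring function would give curves of the kind the captions describe.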
