Proteins. 2017 Mar;85(3):513-527.
doi: 10.1002/prot.25165. Epub 2016 Oct 14.

Human and server docking prediction for CAPRI round 30-35 using LZerD with combined scoring functions

Lenna X Peterson et al. Proteins. 2017 Mar.

Abstract

We report the performance of protein-protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community-wide assessment of state-of-the-art docking methods. Our prediction procedure uses a protein-protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons-PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native-likeness of residues' spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge-based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, that is, whether or not the pool includes near-native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. Proteins 2017; 85:513-527. © 2016 Wiley Periodicals, Inc.

Keywords: CAPRI; computational methods; prediction accuracy; protein docking prediction; protein structure prediction; protein-protein docking; structure modeling.

Figures

Figure 1
Protein docking prediction pipeline used in our group. The tertiary structures of the single proteins of a CAPRI target are modeled following the protocol described in Methods. For the human prediction of CAPRI Round 30, we also used structure models selected from CASP server models. Three parallel runs of LZerD protein docking are performed: two runs with (+) and without (−) binding residue constraints taken from predictions by BindML and cons-PPISP (the gray and white arrows in the diagram), using single chain models generated by our lab protocol, and a third LZerD run (only for human prediction, hashed arrows labeled as CASP) using single chain models selected from CASP server predictions. For each of the three tracks, decoys are ranked by ITScorePro, and the top 1000 decoys are selected. For the LZerD server prediction, the top 5 models each from the decoys with (+) and without (−) binding residue constraints, generated using our single chain models, were submitted. The 1000 models from each track are further reduced to the top 10 models by GOAP, which are then ranked by PRESCO and DFIRE independently. Finally, out of the 30 models in total, models that are consistently ranked among the top by two or more scoring functions are chosen in principle for final submission. Usually such models do not fill the ten slots for submission, and the rest are filled with models ranked high by either of the scores and selected by visual inspection. Biological information from the literature is also applied for final selection if available.
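The consensus step at the end of the pipeline (keeping models ranked near the top by two or more scoring functions) can be illustrated with a minimal sketch. This is not the group's actual code: the function names, the toy score values, the cutoff k, and the convention that a lower score is better are all assumptions made for illustration.

```python
from collections import Counter

def top_k_ids(scores, k):
    """Return ids of the k best models in one score table.

    Assumption: lower score means better model (energy-like convention).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [model_id for model_id, _ in ranked[:k]]

def consensus_models(score_tables, k, min_votes=2):
    """Return models that appear in the top k of at least min_votes score tables."""
    votes = Counter()
    for scores in score_tables.values():
        votes.update(top_k_ids(scores, k))
    chosen = [m for m, v in votes.items() if v >= min_votes]
    # Most votes first; ties broken by model id so the output is deterministic.
    return sorted(chosen, key=lambda m: (-votes[m], m))

# Hypothetical toy scores for four models (values are made up).
toy_scores = {
    "GOAP":   {"m1": -30.0, "m2": -25.0, "m3": -10.0, "m4": -5.0},
    "PRESCO": {"m1": -1.2,  "m2": -0.1,  "m3": -2.0,  "m4": -0.5},
    "DFIRE":  {"m1": -8.0,  "m2": -9.5,  "m3": -1.0,  "m4": -2.0},
}
selected = consensus_models(toy_scores, k=2)
print(selected)
```

In this toy case m1 is in the top two of all three tables and m2 in two of them, so both pass the two-vote rule, while m3 is favored by only one score and would be a candidate for the remaining slots.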
Figure 2
Single and pairwise score distributions for decoys of target T93. This decoy set is a successful docking example, which contains ten acceptable decoys out of 9999 total. The scatter plots show pairwise score distributions. Acceptable models are shown as squares. Along the diagonal, histograms of the Z-scores of individual scores are shown.
Figure 3
Single and pairwise score distributions for decoys of target T72. This is an unsuccessful decoy set example, which contains no acceptable decoys out of 9999 total.
Figure 4
Score distribution of docking decoys of T91 computed using two single chain models of different quality. T91 is a homo-dimer, but the two subunits have slightly different conformations in the native structure, which resulted in different RMSD values for each model compared to the two subunits. The first model has RMSDs of 5.4/5.5 Å to the native structures of the two chains. The second model, a CASP server model (Zhang-Server_TS1), has RMSDs of 4.1/5.1 Å (Table S1). A, Distributions of Z-scores of GOAP and DFIRE. Left, docking decoys from the Zhang-Server_TS1 single chain model. There are 37 acceptable decoys and one medium decoy out of 4793 total. Right, decoys from the first single chain model, computed in our group. No interface prediction was applied. There are nine acceptable decoys out of 6168 total. Acceptable and medium quality models are shown as gold squares and green triangles, respectively. The bottom left corner (labeled A) is the subset of decoys that have Z-score below n = 2 for the two scores (Equation 1). The Spearman correlation coefficient for the decoys in A (Equation 2) is 0.56 (p = 0.0002) for the left distribution, and 0.17 (p = 0.5) for the right. B, Two single chain models superimposed on the native structure of T91, chain C. Green, native; blue, Zhang-Server_TS1; orange, our model. C, The best model from our submission (orange) superimposed on the native complex structure (green). fnat: 0.33; L-RMSD: 9.0 Å; I-RMSD: 4.2 Å.
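The analysis in panel A (take the decoys that score well under both functions, then measure how strongly the two scores agree within that subset) can be sketched as follows. Equations 1 and 2 are not reproduced on this page, so the subset rule used here (both Z-scores below −n, that is, at least n standard deviations better than the mean) is an assumption, as are the function names and the toy data. The sketch computes only the coefficient; a p-value could be obtained with a routine such as `scipy.stats.spearmanr`.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def low_score_subset(score_a, score_b, n):
    """Indices of decoys whose Z-score is below -n for both scores (assumed rule)."""
    za, zb = z_scores(score_a), z_scores(score_b)
    return [i for i in range(len(za)) if za[i] < -n and zb[i] < -n]

# Made-up toy pool: three decoys score far below the rest under both functions.
score_a = [-10.0, -12.0, -11.0] + [0.0] * 27
score_b = [-5.0, -7.0, -6.0] + [0.0] * 27
subset = low_score_subset(score_a, score_b, n=2)
rho = spearman([score_a[i] for i in subset], [score_b[i] for i in subset])
print(subset, rho)
```

A high coefficient in the subset means the two scores order the best-scoring decoys the same way, which is the funnel-like shape the caption associates with the better single chain model.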
Figure 5
Prediction of decoy pool quality based on score pair distribution shape. “Funnel score” is the sum of n over all score pairs where the SCC for the -outliers is significant (p < 0.05) and greater than 0.4 (Equation 3). The dotted line indicates a minimum Funnel score of 3, which classifies 9 true positives, 4 true negatives, 2 false positives (T77 and T88), and 4 false negatives (T75, T86, T89, and T92).
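The Funnel score rule in this caption can be sketched in a few lines. Equation 3 is not shown on this page, so the input layout (one `(n, scc, p)` record per score pair), the function names, and the example records are assumptions; only the thresholds (p < 0.05, SCC > 0.4, minimum score of 3) and the classification counts come from the caption.

```python
def funnel_score(pair_results, min_scc=0.4, alpha=0.05):
    """Sum n over score pairs whose correlation is significant and above min_scc.

    pair_results: iterable of (n, scc, p_value) records, one per score pair
    (an assumed layout, not the paper's data format).
    """
    return sum(n for n, scc, p in pair_results if p < alpha and scc > min_scc)

def predicts_near_native(pair_results, threshold=3):
    """Predict that the decoy pool contains near-native models."""
    return funnel_score(pair_results) >= threshold

# Made-up records for one decoy pool: (n, SCC, p-value) per score pair.
pool = [(2, 0.56, 0.0002), (2, 0.17, 0.5), (1, 0.45, 0.03)]
score = funnel_score(pool)  # first and third pairs qualify: 2 + 1

# Classification counts reported in the caption at a minimum Funnel score of 3.
tp, tn, fp, fn = 9, 4, 2, 4
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(score, predicts_near_native(pool), accuracy, precision, recall)
```

With the caption's counts, 13 of the 19 targets are classified correctly, and 9 of the 11 pools predicted to contain near-native models actually do.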
