Bioinformatics. 2018 Oct 15;34(20):3461-3469.

doi: 10.1093/bioinformatics/bty355.

Efficient flexible backbone protein-protein docking for challenging targets

Nicholas A Marze¹, Shourya S Roy Burman¹, William Sheffler²,³, Jeffrey J Gray¹,⁴,⁵,⁶

Affiliations

¹ Department of Chemical & Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA.
² Department of Biochemistry, University of Washington, Seattle, WA, USA.
³ Institute for Protein Design, University of Washington, Seattle, WA, USA.
⁴ Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD, USA.
⁵ Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, MD, USA.
⁶ Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA.

PMID: 29718115
PMCID: PMC6184633
DOI: 10.1093/bioinformatics/bty355

Nicholas A Marze et al. Bioinformatics. 2018.

Abstract

Motivation: Binding-induced conformational changes challenge current computational docking algorithms by exponentially increasing the conformational space to be explored. To restrict this search to relevant space, some computational docking algorithms exploit the inherent flexibility of the protein monomers to simulate conformational selection from pre-generated ensembles. As the ensemble size expands with increased flexibility, these methods struggle with efficiency and high false positive rates.

Results: Here, we develop and benchmark RosettaDock 4.0, which efficiently samples large conformational ensembles of flexible proteins and docks them using a novel, six-dimensional, coarse-grained score function. A strong discriminative ability allows an eight-fold higher enrichment of near-native candidate structures in the coarse-grained phase compared to RosettaDock 3.2. It adaptively samples 100 conformations each of the ligand and the receptor backbone while increasing computational time by only 20-80%. In local docking of a benchmark set of 88 proteins of varying degrees of flexibility, the expected success rate (defined as cases with ≥50% chance of achieving 3 near-native structures in the 5 top-ranked ones) for blind predictions after resampling is 77% for rigid complexes, 49% for moderately flexible complexes and 31% for highly flexible complexes. These success rates on flexible complexes are a substantial step forward from all existing methods. Additionally, for highly flexible proteins, we demonstrate that when a suitable conformer generation method exists, the method successfully docks the complex.

Availability and implementation: As a part of the Rosetta software suite, RosettaDock 4.0 is available at https://www.rosettacommons.org to all non-commercial users for free and to commercial users for a fee.

Supplementary information: Supplementary data are available at Bioinformatics online.
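The success criterion quoted in the Results (at least a 50% chance of finding 3 near-native structures among the 5 top-ranked ones, evaluated after resampling) can be illustrated with a minimal sketch. The abstract does not spell out the resampling procedure, so the bootstrap with replacement used here is an assumption, as are the function names, the number of resamples, and the representation of each model as a `(score, is_near_native)` pair.

```python
import random


def top_n_hits(models, n=5):
    """Count near-native models among the n lowest-scoring models.

    models: list of (score, is_near_native) pairs; lower score = better rank.
    """
    ranked = sorted(models, key=lambda m: m[0])
    return sum(1 for _, near_native in ranked[:n] if near_native)


def success_probability(models, n_resamples=1000, n=5, min_hits=3, seed=0):
    """Estimate the chance of getting >= min_hits near-native models in the top n.

    Assumption: a plain bootstrap (sampling with replacement) stands in for
    the resampling scheme, which the abstract does not describe.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_resamples):
        resample = rng.choices(models, k=len(models))
        if top_n_hits(resample, n) >= min_hits:
            successes += 1
    return successes / n_resamples


def is_expected_success(models, threshold=0.5, **kwargs):
    """Apply the >= 50% criterion from the abstract to one docking target."""
    return success_probability(models, **kwargs) >= threshold
```

Under these assumptions, a target counts as a success when `is_expected_success` returns `True` for its pool of scored models; the per-category success rate would then be the fraction of targets in that category that pass.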

Figures

**Fig. 1.**
Amount of backbone sampling in RosettaDock 4.0. (A) Modulation of backbone conformer swap trials in RosettaDock 4.0 for each of the first 8 cycles of Monte Carlo moves in the low-resolution search stage. The dashed line indicates the number of trials for each of the different moves in RosettaDock 3.2. Adaptive conformer selection in RosettaDock 4.0 ensures increased backbone swapping frequency for Clp protease adapter over ClpA chaperone, which is less flexible at the interface. (B) Comparison of the number of self-swaps versus swaps to other conformations in RosettaDock 3.2 versus RosettaDock 4.0 for the highly flexible CCS metallochaperone: superoxide dismutase complex. RosettaDock 4.0 has increased backbone sampling both in the number and fraction of other conformations sampled.

**Fig. 2.**
Time comparison of the docking protocols for large ensembles. Average time per decoy for RosettaDock 3.2 (x) and 4.0 (+) with ensembles having 100 receptor and 100 ligand conformations for complexes ranging from 191 to 1026 total residues. Adaptive Conformer Sampling makes RosettaDock 4.0 up to 12 times faster for cases with large interfaces.

**Fig. 3.**
Low-resolution score versus RMSD from native plots for two examples, *viz.* Ras: RALGDS domain complex (A and B) and BET3: TPC6 complex (C and D). (A and C) 10 000 models generated by RosettaDock 3.2 using the centroid score function, and (B and D) 10 000 models generated by RosettaDock 4.0 using the motif dock score (MDS) function. (A) Centroid score does not generate many near-native candidate structures, and it cannot distinguish them from incorrect models. All metrics indicate failure: *N5* = 0, *N100* = 0, *N1000* = 23. (B) MDS generates a large number of near-native candidate structures, and discriminates them from incorrect models. All metrics indicate success: *N5* = 5, *N100* = 95, *N1000* = 750. (C) *N5* = 1 indicates discrimination failure, but *N100* = 86 and *N1000* = 673 indicate that the broader set is enriched in near-native structures. (D) All metrics indicate success: *N5* = 5, *N100* = 98, *N1000* = 813.

**Fig. 4.**
Comparison of performance metrics between RosettaDock 3.2 and RosettaDock 4.0 for individual complexes in the benchmark. Targets are represented by different symbols corresponding to their difficulty category (circle: rigid; triangle: medium; diamond: flexible). Points above the solid line represent better performance in RosettaDock 4.0, while points below the line represent better performance in RosettaDock 3.2. Comparison of (A) ⟨*E_1%*⟩ enrichment values between the two protocols on log-log axes. ⟨*E_1%*⟩ shows marked improvement in the vast majority of the complexes. Dashed lines demarcate regions where the low-scoring set is enriched in near-native structures. Comparison of ⟨N5⟩ values (B) after the low-resolution stage, and (C) after the high-resolution stage (full protocol). Dashed lines highlight the region in which the two protocols differ significantly, i.e. by more than one point in their ⟨N5⟩ values. After the full protocol, 23 of the 88 complexes are modeled significantly better and 7 complexes are modeled significantly worse.
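The enrichment metric in panel (A) is not defined on this page. A common reading, assumed here, is the fraction of near-native models in the lowest-scoring 1% of the pool divided by the fraction of near-native models in the whole pool, so that values above 1 mean the low-scoring set is enriched. The sketch below follows that assumed definition for a single pool; the function name and the `(score, is_near_native)` input format are hypothetical, and the averaging implied by the angle brackets is not shown.

```python
def enrichment(models, fraction=0.01):
    """Enrichment of near-native models in the lowest-scoring fraction.

    models: list of (score, is_near_native) pairs; lower score = better.
    Assumed definition: (hit rate in top fraction) / (hit rate overall).
    """
    if not models:
        raise ValueError("models must not be empty")
    total_hits = sum(1 for _, near_native in models if near_native)
    if total_hits == 0:
        return 0.0  # nothing to enrich when the pool has no near-native models
    ranked = sorted(models, key=lambda m: m[0])
    n_top = max(1, round(len(ranked) * fraction))
    top_hits = sum(1 for _, near_native in ranked[:n_top] if near_native)
    return (top_hits / n_top) / (total_hits / len(ranked))
```

With this definition, a score function that ranks near-native models no better than chance gives a value near 1, and one that places every near-native model in the top 1% gives the maximum possible value for that pool.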

**Fig. 5.**
Improvement in docking performance of RosettaDock 4.0 by doping the ensemble with near-bound decoys for the SRP GTPase: FtsY complex. Score versus RMSD plot of runs with (A) backbone conformations generated using NMA, Backrub and Relax protocols, and (B) ensembles doped with 10% near-bound conformations. (A) Without the ensemble doping, the simulations did not generate medium- or high-quality docked structures, and the acceptable structures did not score low enough to be discriminated from incorrect structures. (B) Ensemble doping generated deep docking funnels with high-quality structures. Colored points indicate the CAPRI-quality category for each decoy, and the blue points provide a reference energy of the refined, bound crystal structure. (C and D) Plot of the contact-residue RMSD_Cα from the bound conformation for the ligand and the receptor conformers selected after the docking simulation for (C) ensembles without near-bound doping, and (D) ensembles with 10% near-bound conformations doped. The RMSD values of the unbound conformations are marked with a green line segment, and those of the near-bound conformations are marked in colors corresponding to the biasing constraint weight. (C) The conformer generation methods are unable to generate sub-Å contact-residue RMSD_Cα structures starting from the unbound ligand conformation (with RMSD_Cα of 3.57 Å) and the unbound receptor conformation (with RMSD_Cα of 2.92 Å). (D) Four of the biased conformations of the ligand and five of the receptor are within 1 Å RMSD_Cα from the bound state. RosettaDock 4.0 is able to recognize these close conformations, find the native-like interface and successfully dock the complex.
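The contact-residue RMSD_Cα used in panels (C) and (D) is a root-mean-square deviation over the Cα atoms of interface residues. A minimal sketch of that calculation follows, assuming the two conformations are already superimposed and list the same residues in the same order; how contact residues are selected is not stated here, so the `contact_indices` argument and both function names are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def contact_residue_ca_rmsd(conformer, bound, contact_indices):
    """RMSD restricted to a chosen subset of residues (hypothetical selection)."""
    subset_a = [conformer[i] for i in contact_indices]
    subset_b = [bound[i] for i in contact_indices]
    return ca_rmsd(subset_a, subset_b)
```

In this picture, a conformer counts as near-bound in the sense of panel (D) when `contact_residue_ca_rmsd` against the bound structure falls below 1 Å.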

**Fig. 6.**
Efficiency of RosettaDock 4.0 on large ensembles. Despite sampling 100 conformations each of the receptor and the ligand, compared with 1 receptor and 10 ligand conformations in RosettaDock 3.2, the time per decoy for RosettaDock 4.0 is only 20-80% longer in 77 of the 88 targets tested.
