Bioinformatics. 2018 Oct 15;34(20):3461-3469.
doi: 10.1093/bioinformatics/bty355.

Efficient flexible backbone protein-protein docking for challenging targets

Nicholas A Marze et al. Bioinformatics. 2018.

Abstract

Motivation: Binding-induced conformational changes challenge current computational docking algorithms by exponentially increasing the conformational space to be explored. To restrict this search to relevant space, some computational docking algorithms exploit the inherent flexibility of the protein monomers to simulate conformational selection from pre-generated ensembles. As the ensemble size expands with increased flexibility, these methods struggle with efficiency and high false positive rates.

Results: Here, we develop and benchmark RosettaDock 4.0, which efficiently samples large conformational ensembles of flexible proteins and docks them using a novel, six-dimensional, coarse-grained score function. A strong discriminative ability allows an eight-fold higher enrichment of near-native candidate structures in the coarse-grained phase compared to RosettaDock 3.2. It adaptively samples 100 conformations each of the ligand and the receptor backbone while increasing computational time by only 20-80%. In local docking of a benchmark set of 88 proteins of varying degrees of flexibility, the expected success rate (defined as cases with ≥50% chance of achieving 3 near-native structures in the 5 top-ranked ones) for blind predictions after resampling is 77% for rigid complexes, 49% for moderately flexible complexes and 31% for highly flexible complexes. These success rates on flexible complexes are a substantial step forward from all existing methods. Additionally, for highly flexible proteins, we demonstrate that when a suitable conformer generation method exists, the method successfully docks the complex.
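The success criterion in the Results can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the authors' benchmarking code: each decoy is taken to be a `(score, rmsd)` pair with lower scores ranking higher, "near-native" is taken to mean an RMSD at or below a hypothetical 5.0 Å cutoff, "3 near-native structures" is read as "at least 3", and the "chance" is estimated by bootstrap resampling of the decoy set. The function names are made up for this example.

```python
import random

def n_top(decoys, k=5, rmsd_cutoff=5.0):
    """Count near-native decoys among the k lowest-scoring ones.

    decoys: list of (score, rmsd) pairs; the 5.0 A cutoff is an assumption.
    """
    ranked = sorted(decoys, key=lambda d: d[0])  # lower score ranks first
    return sum(1 for _, rmsd in ranked[:k] if rmsd <= rmsd_cutoff)

def success_probability(decoys, n_resamples=1000, needed=3, seed=0):
    """Estimate the chance that a resampled decoy set has N5 >= needed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Bootstrap: draw a set of the same size, with replacement.
        resample = rng.choices(decoys, k=len(decoys))
        if n_top(resample) >= needed:
            hits += 1
    return hits / n_resamples

def expected_success_rate(targets, **kwargs):
    """Fraction of targets whose success probability is at least 50%."""
    successes = sum(
        1 for decoys in targets if success_probability(decoys, **kwargs) >= 0.5
    )
    return successes / len(targets)
```

Under this reading, a target counts as a success when at least half of the resampled decoy sets place three or more near-native models in their top five, and the reported percentages would be the fraction of such targets within each flexibility category.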

Availability and implementation: As a part of the Rosetta software suite, RosettaDock 4.0 is available at https://www.rosettacommons.org to all non-commercial users for free and to commercial users for a fee.

Supplementary information: Supplementary data are available at Bioinformatics online.


Figures

Fig. 1.
Amount of backbone sampling in RosettaDock 4.0. (A) Modulation of backbone conformer swap trials in RosettaDock 4.0 for each of the first 8 cycles of Monte Carlo moves in the low-resolution search stage. The dashed line indicates the number of trials for each of the different moves in RosettaDock 3.2. Adaptive conformer selection in RosettaDock 4.0 ensures increased backbone swapping frequency for Clp protease adapter over ClpA chaperone, which is less flexible at the interface. (B) Comparison of the number of self-swaps versus swaps to other conformations in RosettaDock 3.2 versus RosettaDock 4.0 for the highly flexible CCS metallochaperone: superoxide dismutase complex. RosettaDock 4.0 has increased backbone sampling both in the number and fraction of other conformations sampled.
Fig. 2.
Time comparison of the docking protocols for large ensembles. Average time per decoy for RosettaDock 3.2 (x) and 4.0 (+) with ensembles having 100 receptor and 100 ligand conformations for complexes ranging from 191 to 1026 total residues. Adaptive Conformer Sampling makes RosettaDock 4.0 up to 12 times faster for cases with large interfaces
Fig. 3.
Low-resolution score versus RMSD from native plots for two examples, viz. Ras: RALGDS domain complex (A and B) and BET3: TPC6 complex (C and D). (A and C) 10 000 models generated by RosettaDock 3.2 using the centroid score function, and (B and D) 10 000 models generated by RosettaDock 4.0 using the motif dock score (MDS) function. (A) Centroid score does not generate many near-native candidate structures, and it cannot distinguish them from incorrect models. All metrics indicate failure: N5 = 0, N100 = 0, N1000 = 23. (B) MDS generates a large number of near-native candidate structures, and discriminates them from incorrect models. All metrics indicate success: N5 = 5, N100 = 95, N1000 = 750. (C) N5 = 1 indicates discrimination failure, but N100 = 86 and N1000 = 673 indicate that the broader set is enriched in near-native structures. (D) All metrics indicate success: N5 = 5, N100 = 98, N1000 = 813.
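The N5, N100 and N1000 values quoted in the Fig. 3 caption appear to count near-native models among the 5, 100 and 1000 best-scoring ones. A minimal sketch of that counting follows, assuming lower scores are better and using a hypothetical 5.0 Å RMSD cutoff for "near-native"; the caption does not state the cutoff, and `top_k_counts` is an illustrative name, not a Rosetta function.

```python
def top_k_counts(scores, rmsds, ks=(5, 100, 1000), rmsd_cutoff=5.0):
    """Return {k: number of near-native models among the k best-scoring}.

    scores and rmsds are parallel lists, one entry per model.
    The 5.0 A near-native cutoff is an assumption for illustration.
    """
    if len(scores) != len(rmsds):
        raise ValueError("scores and rmsds must have the same length")
    # Indices of models ordered from lowest (best) to highest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return {
        k: sum(1 for i in order[:k] if rmsds[i] <= rmsd_cutoff)
        for k in ks
    }
```

Read this way, panel C is a case where the top five contain few near-native models even though the top 100 and top 1000 are rich in them, which is why the caption calls it a discrimination failure rather than a sampling failure.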
Fig. 4.
Comparison of performance metrics between RosettaDock 3.2 and RosettaDock 4.0 for individual complexes in the benchmark. Targets are represented by different symbols corresponding to their difficulty category (circle: rigid; triangle: medium; diamond: flexible). Points above the solid line represent better performance in RosettaDock 4.0, while points below the line represent better performance in RosettaDock 3.2. Comparison of (A) ⟨E1%⟩ enrichment values between the two protocols on log-log axes. ⟨E1%⟩ shows marked improvement in the vast majority of the complexes. Dashed lines demarcate regions where the low-scoring set is enriched in near-native structures. Comparison of ⟨N5⟩ values (B) after the low-resolution stage, and (C) after the high-resolution stage (full protocol). Dashed lines highlight the region in which the two protocols differ significantly, i.e. by more than one point in their ⟨N5⟩ values. After the full protocol, 23 of the 88 complexes are modeled significantly better and 7 complexes are modeled significantly worse.
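The caption does not define the enrichment E1%. One common reading, used here purely as an assumption, is the fraction of near-native models in the lowest-scoring 1% divided by the fraction of near-native models in the whole set, so that values above 1 mean the low-scoring set is enriched. The sketch below follows that reading with a hypothetical 5.0 Å near-native cutoff; the angle brackets in the caption presumably denote an average over resampled sets, which this example does not attempt.

```python
def enrichment(scores, rmsds, top_fraction=0.01, rmsd_cutoff=5.0):
    """Enrichment of near-native models in the lowest-scoring fraction.

    Assumed definition: (near-native fraction in the top slice) divided by
    (near-native fraction overall). Returns 0.0 if nothing is near-native.
    """
    n = len(scores)
    if n == 0 or len(rmsds) != n:
        raise ValueError("scores and rmsds must be non-empty and equal length")
    near = [r <= rmsd_cutoff for r in rmsds]
    total_near = sum(near)
    if total_near == 0:
        return 0.0
    n_top = max(1, round(n * top_fraction))  # size of the low-scoring slice
    order = sorted(range(n), key=lambda i: scores[i])  # best score first
    top_near = sum(near[i] for i in order[:n_top])
    return (top_near / n_top) / (total_near / n)
```

With 1000 models of which only the 10 best-scoring are near-native, this gives an enrichment of 100; if near-native models are spread evenly across the score range, it gives about 1.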
Fig. 5.
Improvement in docking performance of RosettaDock 4.0 by doping the ensemble with near-bound decoys for SRP GTPase: FtsY complex. Score versus RMSD plot of runs with (A) backbone conformations generated using NMA, Backrub and Relax protocols, and (B) ensembles doped with 10% near-bound conformations. (A) Without the ensemble doping, the simulations did not generate medium- or high-quality docked structures, and the acceptable structures did not score low enough to be discriminated from incorrect structures. (B) Ensemble doping generated deep docking funnels with high-quality structures. Colored points indicate CAPRI-quality category for each decoy, and the blue points provide a reference energy of the refined, bound crystal structure. (C and D) Plot of the contact-residue RMSD from the bound conformation for the ligand and the receptor conformers selected after the docking simulation for (C) ensembles without near-native doping, and (D) ensembles with 10% near-bound conformations doped. The RMSD values of the unbound conformations are marked with a green line segment, and those of the near-bound conformations are marked in colors corresponding to the biasing constraint weight. (C) The conformer generation methods are unable to generate sub-Å contact-residue RMSD structures starting from the unbound ligand conformation (with RMSD of 3.57 Å) and the unbound receptor conformation (with RMSD of 2.92 Å). (D) Four of the biased conformations of the ligand and five of the receptor are within 1 Å RMSD from the bound state. RosettaDock 4.0 is able to recognize these close conformations, find the native-like interface and successfully dock the complex
Fig. 6.
Efficiency of RosettaDock 4.0 on large ensembles. Despite sampling 100 conformations each of the receptor and the ligand, compared with 1 receptor and 10 ligand conformations in RosettaDock 3.2, the time per decoy for RosettaDock 4.0 is only 20-80% higher for 77 of the 88 targets tested.

