Overview of the SAMPL5 host-guest challenge: Are we doing better?

Jian Yin¹, Niel M Henriksen¹, David R Slochower¹, Michael R Shirts², Michael W Chiu³, David L Mobley⁴, Michael K Gilson⁵

Affiliations

¹ Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA.
² Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80309, USA.
³ Qualcomm Institute, University of California, San Diego, La Jolla, CA, 92093, USA.
⁴ Departments of Pharmaceutical Sciences and Chemistry, University of California Irvine, Irvine, CA, 92697, USA.
⁵ Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA. mgilson@ucsd.edu.

PMID: 27658802
PMCID: PMC5241188
DOI: 10.1007/s10822-016-9974-4

Review

Jian Yin et al. J Comput Aided Mol Des. 2017 Jan;31(1):1-19. doi: 10.1007/s10822-016-9974-4. Epub 2016 Sep 22.

Abstract

The ability to computationally predict protein-small molecule binding affinities with high accuracy would accelerate drug discovery and reduce its cost by eliminating rounds of trial-and-error synthesis and experimental evaluation of candidate ligands. As academic and industrial groups work toward this capability, there is an ongoing need for datasets that can be used to rigorously test new computational methods. Although protein-ligand data are clearly important for this purpose, their size and complexity make it difficult to obtain well-converged results and to troubleshoot computational methods. Host-guest systems offer a valuable alternative class of test cases, as they exemplify noncovalent molecular recognition but are far smaller and simpler. As a consequence, host-guest systems have been part of the prior two rounds of SAMPL prediction exercises, and they also figure in the present SAMPL5 round. In addition to being blinded, and thus avoiding biases that may arise in retrospective studies, the SAMPL challenges have the merit of focusing multiple researchers on a common set of molecular systems, so that methods may be compared and ideas exchanged. The present paper provides an overview of the host-guest component of SAMPL5, which centers on three different hosts (two octa-acids and a glycoluril-based molecular clip) and two different sets of guest molecules, all in aqueous solution. A range of methods were applied, including electronic structure calculations with implicit solvent models; methods that combine empirical force fields with implicit solvent models; and explicit solvent free energy simulations. The most reliable methods tend to fall in the last class, consistent with results in prior SAMPL rounds, but the level of accuracy is still below that sought for reliable computer-aided drug design. Advances in force field accuracy, modeling of protonation equilibria, electronic structure methods, and solvent models hold promise for future improvements.

Keywords: Binding affinity; Blind challenge; Computer-aided drug design; Host–guest; Molecular recognition.

Figures

**Fig. 1**
Structures of the hosts OAH, OAMe, and CBClip and their guest molecules. OAH and OAMe are also known as OA and TEMOA, respectively. All host molecules are shown in two perspectives. *Silver* carbon, *Blue* nitrogen, *Red* oxygen, *Yellow* sulfur. Non-polar hydrogen atoms were omitted for clarity. OA-G1–OA-G6 are the common guest molecules for OAH and OAMe, and CBC-G1–CBC-G10 are guests for CBClip. Protonation states of all host and guest molecules shown in the figure were suggested by the organizers based on the expected pKas and the experimental pH values.

**Fig. 2**
OAH/OAMe submissions ranked based on the original values of absolute error metrics (*white circles*), which were computed from reported binding affinities without resampling or considering any uncertainty sources. The violin plot describes the shape of the sampling distribution for each set of predictions when bootstrapping 100,000 samples with replacement, and the vertical bar represents the mean of the distribution. The computational uncertainties are absent in the Null1, MovTyp-1, and MovTyp-2 predictions. Two null models are shown in *red*. The violin plot areas, here and below, are normalized not to unity, but instead to give the same maximum thickness.

**Fig. 3**
OAH/OAMe submissions ranked based on the original values of offset error metrics (*white circles*), which were computed from reported binding affinities without resampling or considering any uncertainty sources. The violin plot describes the shape of the sampling distribution for each set of predictions when bootstrapping 100,000 samples with replacement, and the vertical bar represents the mean of the distribution. The computational uncertainties are absent in the Null1, MovTyp-1, MovTyp-2, DFT/TPSS-n, DFT/TPSS-C, and DLPNO-CCSD(T) predictions. Two null models are shown in *red*.

**Fig. 4**
CBClip submissions ranked based on the original values of absolute error metrics (*white circles*), which were computed from reported binding affinities without resampling or considering any uncertainty sources. The violin plot describes the shape of the sampling distribution for each set of predictions when bootstrapping 100,000 samples with replacement, and the vertical bar represents the mean of the distribution. Two null models are shown in *red*. The computational uncertainties are absent in the Null1, MovTyp-1, and MovTyp-2 predictions.

**Fig. 5**
Combined OAH/OAMe predictions with MSE offsets using the (a) APR-TIP3P, (b) SOMD-3, and (c) DFT/TPSS-n methods. CBClip predictions without MSE offset using (d) the Null2 model, (e) the SOMD-3 method, and (f) the BEDAM method. *Purple dots* OAH, *red dots* OAMe, *cyan dots* CBClip, *solid black* line of identity.

**Fig. 6**
Structures of the hosts H1 and cucurbit[7]uril (CB7), tested in prior SAMPL host–guest challenges. *Silver* carbon, *Blue* nitrogen, *Red* oxygen. Hydrogen atoms were omitted for clarity.
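
The captions for Figs. 2–4 describe error metrics whose sampling distributions come from bootstrapping with replacement, and Figs. 3 and 5 refer to offset metrics. The following is a minimal Python sketch of that idea, not the organizers' actual analysis code. It assumes that "MSE" denotes the mean signed error and that the offset metric is the RMSE after subtracting that error from every prediction. The function names and the binding free energies are hypothetical, chosen only for illustration, and the sketch ignores the reported experimental and computational uncertainties.

```python
import math
import random


def rmse(pred, expt):
    """Root-mean-square error between predicted and experimental values."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(pred))


def mean_signed_error(pred, expt):
    """Mean of (prediction - experiment); assumed meaning of 'MSE' here."""
    return sum(p - e for p, e in zip(pred, expt)) / len(pred)


def offset_rmse(pred, expt):
    """RMSE after shifting all predictions by their mean signed error."""
    shift = mean_signed_error(pred, expt)
    return rmse([p - shift for p in pred], expt)


def bootstrap(metric, pred, expt, n_samples=10_000, seed=0):
    """Resample host-guest pairs with replacement and recompute the metric."""
    rng = random.Random(seed)
    n = len(pred)
    values = []
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n pairs with replacement
        values.append(metric([pred[i] for i in idx], [expt[i] for i in idx]))
    return values


# Hypothetical binding free energies in kcal/mol (not data from the challenge).
expt = [-5.0, -4.3, -6.1, -7.2, -5.8, -4.9]
pred = [-6.2, -5.0, -7.9, -8.1, -6.5, -6.6]

original = rmse(pred, expt)            # analogous to the "white circle" value
dist = bootstrap(rmse, pred, expt)     # analogous to the violin-plot samples
boot_mean = sum(dist) / len(dist)      # analogous to the vertical bar
offset_value = offset_rmse(pred, expt)

print(f"RMSE = {original:.2f}, bootstrap mean = {boot_mean:.2f}, "
      f"offset RMSE = {offset_value:.2f}")
```

Because subtracting the mean signed error removes any uniform shift, the offset RMSE can never exceed the plain RMSE. It therefore separates systematic over- or under-binding from scatter in the predictions.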
