Large-Scale Docking in the Cloud
- PMID: 37071086
- PMCID: PMC10170500
- DOI: 10.1021/acs.jcim.3c00031
Abstract
Molecular docking is a pragmatic approach to exploit protein structures for new ligand discovery, but the growing size of available chemical space is increasingly challenging to screen on in-house computer clusters. We have therefore developed AWS-DOCK, a protocol for running UCSF DOCK in the AWS cloud. Our approach combines the low cost and scalability of cloud resources with a docking engine that has a low per-molecule cost to screen billions of molecules efficiently. We benchmarked our system by screening 50 million molecules with a heavy atom count (HAC) of 22 against the DRD4 receptor, with an average CPU time of around 1 s per molecule. We saw up to 3-fold variations in cost between AWS availability zones. Docking 4.5 billion lead-like molecules, a 7-week calculation on our 1000-core lab cluster, runs in about a week in AWS, depending on the CPUs accessible, for around $25,000, less than the cost of two new nodes. The cloud docking protocol is described in easy-to-follow steps and may be sufficiently general to be used for other docking programs. All the tools to enable AWS-DOCK are freely available to everyone, while DOCK 3.8 is free for academic research.
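The figures quoted in the abstract can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and is not part of AWS-DOCK. The function names are hypothetical, and it assumes that the benchmarked rate of about 1 CPU-second per molecule (measured on HAC 22 molecules) also holds for the 4.5 billion lead-like molecules, and that the workload scales perfectly across cores.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_WEEK = 7 * 24 * 3600


def cpu_hours(n_molecules, cpu_seconds_per_molecule):
    """Total CPU-hours needed to dock n_molecules."""
    return n_molecules * cpu_seconds_per_molecule / SECONDS_PER_HOUR


def wall_weeks(n_molecules, cpu_seconds_per_molecule, n_cores):
    """Wall-clock weeks on n_cores, assuming perfect parallel scaling."""
    total_seconds = n_molecules * cpu_seconds_per_molecule
    return total_seconds / n_cores / SECONDS_PER_WEEK


def cores_for_deadline(n_molecules, cpu_seconds_per_molecule, weeks):
    """Cores needed to finish within the given number of weeks."""
    total_seconds = n_molecules * cpu_seconds_per_molecule
    return total_seconds / (weeks * SECONDS_PER_WEEK)


def implied_cost_per_cpu_hour(total_cost_usd, n_molecules, cpu_seconds_per_molecule):
    """Average price per CPU-hour implied by a total campaign cost."""
    return total_cost_usd / cpu_hours(n_molecules, cpu_seconds_per_molecule)


# Values taken from the abstract; 1.0 s per molecule is the assumed rate.
N_MOLECULES = 4.5e9
CPU_S_PER_MOLECULE = 1.0

print(f"CPU-hours: {cpu_hours(N_MOLECULES, CPU_S_PER_MOLECULE):,.0f}")
print(f"Weeks on 1000 cores: {wall_weeks(N_MOLECULES, CPU_S_PER_MOLECULE, 1000):.1f}")
print(f"Cores for 1 week: {cores_for_deadline(N_MOLECULES, CPU_S_PER_MOLECULE, 1):,.0f}")
print(f"USD per CPU-hour: {implied_cost_per_cpu_hour(25_000, N_MOLECULES, CPU_S_PER_MOLECULE):.3f}")
```

Under these assumptions the campaign needs about 1.25 million CPU-hours, which is roughly 7.4 weeks on 1000 cores, in line with the stated 7-week estimate. Finishing in one week would take on the order of 7,400 cores, and a $25,000 total corresponds to about $0.02 per CPU-hour.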
Conflict of interest statement
The authors declare the following competing financial interest(s): J.J.I. is a co-founder of Blue Dolphin Lead Discovery LLC, a contract research organization for molecular docking screens. He is also a co-founder of Deep Apple Therapeutics, Inc.