This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2024 Nov 14:arXiv:2411.09820v1.

WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking

Yunchao Lance Liu¹, Ha Dong², Xin Wang¹, Rocco Moretti³, Yu Wang⁴, Zhaoqian Su⁵, Jiawei Gu⁶, Bobby Bodenheimer¹,⁷,⁸, Charles David Weaver³,⁹, Jens Meiler³,¹⁰,¹¹,¹²,¹³,¹⁴, Tyler Derr¹,⁵

Affiliations

¹ Computer Science Dept., Vanderbilt University (VU).
² Neural Science Dept., Amherst College.
³ Chemistry Dept., VU.
⁴ Computer Science Dept., University of Oregon.
⁵ Data Science Institute, VU.
⁶ MD Anderson Cancer Center.
⁷ Electrical and Computer Engineering Dept., VU.
⁸ Psychology Dept., VU.
⁹ Institute of Chemical Biology, VU.
¹⁰ Center for Structural Biology, VU.
¹¹ Pharmacology Dept., VU.
¹² Institute for Drug Discovery, Leipzig University (LU).
¹³ Computer Science Dept., LU.
¹⁴ Chemistry Dept., LU.

PMID: 39606732
PMCID: PMC11601797

Abstract

While deep learning has revolutionized computer-aided drug discovery, the AI community has predominantly focused on model innovation and placed less emphasis on establishing best benchmarking practices. We posit that without a sound model evaluation framework, the AI community's efforts cannot reach their full potential, which slows the progress and transfer of innovation into real-world drug discovery. In this paper, we therefore seek to establish a new gold standard for small molecule drug discovery benchmarking, WelQrate. Our contributions are threefold. (1) WelQrate Dataset Collection: we introduce a meticulously curated collection of 9 datasets spanning 5 therapeutic target classes. Our hierarchical curation pipelines, designed by drug discovery experts, go beyond the primary high-throughput screen by leveraging additional confirmatory and counter screens, along with rigorous domain-driven preprocessing such as Pan-Assay Interference Compounds (PAINS) filtering, to ensure high data quality. (2) WelQrate Evaluation Framework: we propose a standardized model evaluation framework that covers high-quality datasets, featurization, 3D conformation generation, evaluation metrics, and data splits, providing a reliable benchmark for drug discovery experts conducting real-world virtual screening. (3) Benchmarking: we evaluate model performance through several research questions using the WelQrate dataset collection, exploring the effects of different models, dataset quality, featurization methods, and data splitting strategies on the results. In summary, we recommend adopting WelQrate as the gold standard in small molecule drug discovery benchmarking. The WelQrate dataset collection, the curation code, and the experimental scripts are all publicly available at WelQrate.org.
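The hierarchical curation idea in the abstract can be illustrated with plain set operations. The following is a minimal sketch only, not the authors' pipeline: the function name `curate_actives`, the compound IDs, and the assumption that PAINS-flagged compounds arrive as a precomputed set of IDs are all hypothetical.

```python
def curate_actives(primary_actives, confirmed_actives,
                   counter_screen_actives, flagged=frozenset()):
    """Keep compounds that are active in the primary screen AND verified in
    the confirmatory screen, then drop counter-screen hits and any compounds
    flagged by a preprocessing filter (e.g. a PAINS filter).

    All arguments are iterables of compound IDs (hypothetical format).
    """
    # A confirmatory result only counts for compounds that passed the primary screen.
    confirmed = set(primary_actives) & set(confirmed_actives)
    # Counter-screen actives are likely artifacts, so they are excluded.
    return confirmed - set(counter_screen_actives) - set(flagged)


# Toy data (hypothetical compound IDs, not taken from any real assay).
primary_hits = {"c1", "c2", "c3", "c4", "c5"}
confirmed_hits = {"c2", "c3", "c4", "c9"}  # c9 never passed the primary screen
counter_hits = {"c3"}                      # active in the counter screen
pains_flagged = {"c4"}                     # assumed output of a PAINS filter

final_actives = curate_actives(primary_hits, confirmed_hits,
                               counter_hits, pains_flagged)
print(sorted(final_actives))
```

In this toy example only `c2` survives every stage, which shows how each additional screen narrows the active set beyond what the primary screen alone would report.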

Figures

**Fig. 1:**
An overview of the data curation pipeline.

**Fig. 2:**
An example of hierarchical curation for AID 1798. Initially, 63,676 compounds go through a primary screen (AID 626). The 1,665 actives it identifies then go through a confirmatory screen (AID 1488) to verify their activity, and those that also show activity in a counter screen (AID 1741) are excluded from the final active set.

**Fig. 3:**
Illustration of the adapted cross-validation.

**Fig. 4:**
Performance comparison by model category (RQ1) for models trained on the *WelQrate* dataset collection versus the control dataset (RQ2). Individual model performances are shown in Fig. 6. Values are averaged across datasets. Error bars denote the standard error across multiple experimental runs and AIDs. In the legend, *WelQrate* refers to the *WelQrate* dataset collection.

**Fig. 5:**
Comparison of model performance using one-hot encoding and pre-defined features in *WelQrate* dataset collection (RQ3). Error bars denote standard error across multiple experimental runs.

**Fig. 6:**
Comparison of model performance under random and scaffold split (RQ4). Error bars denote standard error across multiple experimental runs and AIDs.
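The difference between the random and scaffold splits compared in RQ4 can be sketched as follows. This is a hedged illustration rather than the benchmark's own splitting code: the helper names, the toy molecule IDs, and the precomputed scaffold labels are hypothetical, and the "largest scaffold groups go to training first" ordering is one common convention, assumed here.

```python
import random
from collections import defaultdict


def scaffold_split(scaffolds, test_fraction=0.2):
    """Split compound IDs so that no scaffold appears in both train and test.

    scaffolds: dict mapping compound ID -> scaffold label (assumed precomputed).
    """
    groups = defaultdict(list)
    for compound_id, scaffold in scaffolds.items():
        groups[scaffold].append(compound_id)
    # Deterministic order: largest groups first, ties broken by first ID.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))
    n_train_target = len(scaffolds) - int(round(len(scaffolds) * test_fraction))
    train, test = [], []
    for group in ordered:
        # A whole scaffold group is assigned to one side, never divided.
        if len(train) + len(group) <= n_train_target:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


def random_split(ids, test_fraction=0.2, seed=0):
    """Shuffle IDs and cut; similar scaffolds may land on both sides."""
    ids = sorted(ids)
    random.Random(seed).shuffle(ids)
    n_test = int(round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


# Toy data (hypothetical IDs and scaffold labels).
toy = {"m1": "benzene", "m2": "benzene", "m3": "benzene",
       "m4": "pyridine", "m5": "pyridine", "m6": "indole",
       "m7": "indole", "m8": "furan", "m9": "furan", "m10": "thiophene"}

train_ids, test_ids = scaffold_split(toy)
rand_train, rand_test = random_split(toy)
print("scaffold test set:", test_ids)
```

Because the scaffold split holds out entire scaffold groups, the test compounds are structurally distinct from the training compounds, which generally makes it a harder and more realistic setting than a random split.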
