Benchmarking ligand-based virtual High-Throughput Screening with the PubChem database

Mariusz Butkiewicz¹, Edward W Lowe Jr, Ralf Mueller, Jeffrey L Mendenhall, Pedro L Teixeira, C David Weaver, Jens Meiler

Affiliation

¹ Department of Chemistry, Pharmacology, and Biomedical Informatics, Center for Structural Biology, Institute of Chemical Biology, Vanderbilt University, Nashville, TN 37232, USA.

PMID: 23299552
PMCID: PMC3759399
DOI: 10.3390/molecules18010735

Mariusz Butkiewicz et al. Molecules. 2013 Jan 8;18(1):735-56.

Abstract

With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is in the public domain through PubChem and was carefully collated using confirmation screens that validate active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework, BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure-activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed, including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 are observed at a true positive rate (TPR) cutoff of 25%.
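
The enrichment figure quoted above can be illustrated with a minimal Python sketch. It is not taken from BCL::ChemInfo. It assumes that enrichment means the fraction of actives among the top-ranked compounds divided by the fraction of actives in the whole set, measured at the point in the ranking where the chosen share of all actives has been recovered. The function names `enrichment_at_tpr` and `consensus_scores` are hypothetical, and the unweighted average is only one possible way to combine models; the abstract does not say how the consensus is formed.

```python
from typing import List, Sequence


def enrichment_at_tpr(scores: Sequence[float],
                      labels: Sequence[int],
                      tpr_cutoff: float = 0.25) -> float:
    """Enrichment among top-ranked compounds once `tpr_cutoff` of all
    actives has been recovered (assumed definition, see lead-in)."""
    total = len(labels)
    actives = sum(labels)
    if total == 0 or actives == 0:
        raise ValueError("need at least one compound and one active")

    # Rank compounds from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    needed = tpr_cutoff * actives  # number of actives that must be recovered
    found = 0
    for selected, (_, label) in enumerate(ranked, start=1):
        found += label
        if found >= needed:
            precision = found / selected   # actives among selected compounds
            base_rate = actives / total    # actives in the full data set
            return precision / base_rate
    raise ValueError("TPR cutoff was never reached")


def consensus_scores(*model_scores: Sequence[float]) -> List[float]:
    """Unweighted mean of per-compound scores from several models
    (an assumed, simplest-case consensus scheme)."""
    return [sum(values) / len(values) for values in zip(*model_scores)]
```

In practice the per-model scores would need to be on a comparable scale before averaging, for example by rank-normalizing each model's output; that step is omitted here.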
