PLoS One. 2013 Nov 1;8(11):e79568.
doi: 10.1371/journal.pone.0079568. eCollection 2013.

SWEETLEAD: an in silico database of approved drugs, regulated chemicals, and herbal isolates for computer-aided drug discovery

Paul A Novick et al. PLoS One.

Abstract

In the face of drastically rising drug discovery costs, strategies promising to reduce development timelines and expenditures are being pursued. Computer-aided virtual screening and repurposing of approved drugs are two such strategies that have shown recent success. Herein, we report the creation of a highly curated in silico database of chemical structures representing approved drugs, chemical isolates from traditional medicinal herbs, and regulated chemicals, termed the SWEETLEAD database. The motivation for SWEETLEAD stems from the observation of conflicting information in publicly available chemical databases and the lack of a highly curated database of chemical structures for the globally approved drugs. A consensus-building scheme surveying information from several publicly accessible databases was employed to identify the correct structure for each chemical. The resulting structures were filtered for the active pharmaceutical ingredient and standardized, and differing formulations of the same drug were combined in the final database. The publicly available release of SWEETLEAD (https://simtk.org/home/sweetlead) provides an important tool to enable the successful completion of computer-aided repurposing and drug discovery campaigns.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Effect of inaccurate ligand structural information on virtual screening performance.
A) and B) Chemical structures for indinavir as depicted by the OpenEye Scientific visualization program VIDA. The 2D structures returned by ChemSpider (A, ChemSpider ID 4515036) and PubChem (B, PubChem ID 5362440) differ in the stereochemistry of a single chiral center, highlighted by the red circles. C) Effect of the differing structures of indinavir on docking results obtained by the OpenEye Scientific docking program FRED. Ten low-energy conformers of each ligand were created with the OpenEye Scientific program Omega, and ligands were docked into the protein structure of HIV protease extracted from the indinavir-bound crystal structure PDB 2R5P. The best-scoring poses for the PubChem (green carbons) and ChemSpider (orange carbons) ligands are shown in comparison to the crystallographic ligand (yellow carbons). While the correct structure, obtained from PubChem, scores in the top 6% of all approved drugs, the incorrect structure scores in the bottom 12%.
Figure 2. Workflow of the consensus building algorithm.
The process of identifying a correct structure for a given drug begins with a drug or chemical name. In the first stage of the algorithm, the Data Collection stage, several databases are polled by name, and the database IDs linked to that name are retrieved and ranked by how frequently each ID was returned (i.e., which ID is ‘most popular’ among the databases polled). For each ID returned, the chemical structure associated with that ID is retrieved and standardized (salts removed, standard protonation states and aromaticity models applied, etc.). In the second stage, the Data Curation stage, the most popular structures from each database are compared. If all structures match, then the structure is assumed to be correct and is assigned to the drug name in the final SWEETLEAD database. If the structures do not match, an iterative cycling through the most popular structures for each database attempts to identify a consensus structure for the drug name. If a consensus or majority structure cannot be identified, a manual review is undertaken. Finally, duplicate structures in SWEETLEAD are combined, to allow for numerous brand names and other identifiers for approved drugs.
Figure 3. Example outcomes of chemical names input into the SWEETLEAD workflow.
For the list of 1996 API names from the FDA Orange Book, the percentage of compounds is shown for which a consensus structure, a majority-vote structure, or no clear majority structure was identified via the SWEETLEAD algorithm. Of these drug names, a consensus or majority structure was determined for 91% of compounds.
Figure 4. ‘Drug-like’ properties of approved drugs vs. non-approved compounds in SWEETLEAD.
Comparison of molecular descriptors frequently referenced as important to drug-likeness between approved drugs and other compounds in the SWEETLEAD database. The property distributions for both the approved drugs and non-approved compounds in SWEETLEAD are shown for A) molecular weight, B) the number of rotatable bonds, C) the number of hydrogen bond donors and D) acceptors.
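
The Data Curation stage described in the Figure 2 caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `resolve_structure` and `merge_duplicates`, the database labels, and the placeholder structure strings are all hypothetical, and the iterative cycling through lower-ranked structures is simplified here to a single strict-majority vote over each database's most popular structure. Structures are assumed to be already standardized, so plain string equality stands in for structure comparison.

```python
from collections import Counter
from typing import Optional


def resolve_structure(
    candidates: dict[str, list[str]],
) -> tuple[Optional[str], str]:
    """Pick a structure for one drug name (hypothetical helper).

    `candidates` maps a database label to that database's standardized
    structure strings, ordered from most to least popular.
    Returns (structure, outcome), where outcome is one of
    "consensus", "majority", or "manual_review".
    """
    # Take the most popular structure from every database that returned one.
    tops = [ranked[0] for ranked in candidates.values() if ranked]
    if not tops:
        return None, "manual_review"

    # All databases agree: accept the structure as the consensus.
    if len(set(tops)) == 1:
        return tops[0], "consensus"

    # Simplification (assumption): a strict majority vote replaces the
    # iterative cycling described in the caption.
    structure, votes = Counter(tops).most_common(1)[0]
    if votes > len(tops) / 2:
        return structure, "majority"

    # No consensus or majority: flag the name for manual review.
    return None, "manual_review"


def merge_duplicates(assignments: dict[str, str]) -> dict[str, list[str]]:
    """Group drug names that resolved to the same structure."""
    merged: dict[str, list[str]] = {}
    for name, structure in assignments.items():
        merged.setdefault(structure, []).append(name)
    return merged


if __name__ == "__main__":
    # Placeholder strings, not real chemical structures.
    polled = {"db_a": ["S1"], "db_b": ["S1"], "db_c": ["S2"]}
    print(resolve_structure(polled))
    print(merge_duplicates({"brand": "S1", "generic": "S1", "other": "S2"}))
```

The three outcome labels mirror the categories reported in Figure 3 (consensus, majority vote, no clear majority). In practice the comparison would operate on canonical structure representations rather than arbitrary strings.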

