Nucleic Acids Res. 2024 Jan 5;52(D1):D1180-D1192.
doi: 10.1093/nar/gkad1004.

The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods

Barbara Zdrazil et al. Nucleic Acids Res. 2024.

Abstract

ChEMBL (https://www.ebi.ac.uk/chembl/) is a manually curated, high-quality, large-scale, open, FAIR and Global Core Biodata Resource of bioactive molecules with drug-like properties, previously described in the 2012, 2014, 2017 and 2019 Nucleic Acids Research Database Issues. Since its introduction in 2009, ChEMBL's content has changed dramatically in size and in the diversity of its data types. Through the incorporation of multiple new datasets from depositors since the 2019 update, ChEMBL now contains slightly more bioactivity data from deposited datasets than from data extracted from the literature. In collaboration with the EUbOPEN consortium, chemical probe data are now regularly deposited into ChEMBL. Release 27 made curated data available for compounds screened for potential anti-SARS-CoV-2 activity in several large-scale drug repurposing screens. In addition, new patent bioactivity data have been added to the latest ChEMBL releases, and various new features have been incorporated, including a Natural Product likeness score, updated flags for Natural Products, a new flag for Chemical Probes, and the initial annotation of the action type for ∼270 000 bioactivity measurements.

Figures

Graphical Abstract
Figure 1. (A) Graph on the left shows the distribution of bioactivities in ChEMBL from different document types (‘PUBLICATION’, ‘DATASET’) over time (releases v09–v33). Document types ‘PATENT’ and ‘BOOK’ have not been included in this graph since these are only assigned to a small portion of bioactivities. (B) Graph on the right shows the increase in the number of documents of document type ‘DATASET’ in ChEMBL over time (ChEMBL 24–ChEMBL 33).
Figure 2. Data in ChEMBL covers all stages of the drug discovery pipeline.
Figure 3. Graphs showing the share of assay types (top panel), target types (central panel), and organism classes (bottom panel) in different ChEMBL releases (v24–v33) for publications (left panel), and for deposited datasets (right panel).
Figure 4. Bar plot showing the numbers of drugs associated with each molecule type across different releases of ChEMBL.
Figure 5. Left: word cloud of all abstracts of papers in PubChem mentioning ChEMBL within the last 5 years; Right: time trends of papers mentioning ChEMBL together with other drug discovery related terms within the past 10 years.
Figure 6. Target predictions based on conformal prediction models available via the ChEMBL web interface. The example shows predictions for imatinib (CHEMBL941).
Figure 7. Image of the Search by IDs dedicated menu.
