Bioinformatics. 2023 Jan 1;39(1):btac793.
doi: 10.1093/bioinformatics/btac793.

Annotation of biologically relevant ligands in UniProtKB using ChEBI

Elisabeth Coudert et al. Bioinformatics.

Abstract

Motivation: To provide high-quality, computationally tractable annotation of binding sites for biologically relevant (cognate) ligands in UniProtKB using the chemical ontology ChEBI (Chemical Entities of Biological Interest), in order to better support efforts to study and predict functionally relevant interactions between protein sequences and structures and small-molecule ligands.

Results: We structured the data model for cognate ligand binding site annotations in UniProtKB and performed a complete reannotation of all cognate ligand binding sites using stable unique identifiers from ChEBI, which we now use as the reference vocabulary for all such annotations. We developed improved search and query facilities for cognate ligands in the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that ChEBI provides.

Availability and implementation: Binding site annotations for cognate ligands described using ChEBI are provided for UniProtKB protein sequence records in several formats (text, XML and RDF) and can be freely queried and downloaded through the UniProt website (www.uniprot.org), REST API (www.uniprot.org/help/api), SPARQL endpoint (sparql.uniprot.org/) and FTP site (https://ftp.uniprot.org/pub/databases/uniprot/).
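
As a rough illustration of the programmatic access described above, the following minimal sketch builds a REST search URL for UniProtKB entries annotated with a binding site for a given ChEBI ligand. The abstract does not specify request syntax, so the base URL (rest.uniprot.org), the `ft_binding` query field, and the `format` and `size` parameters are assumptions to check against the API help page (www.uniprot.org/help/api). The helper name `build_ligand_query_url` is hypothetical, and the ChEBI identifier is only a placeholder.

```python
from urllib.parse import urlencode

# Assumed base URL of the UniProtKB search service (not stated in the abstract).
BASE_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_ligand_query_url(chebi_id: str, size: int = 5) -> str:
    """Build a search URL for entries with a binding site for a ChEBI ligand.

    Hypothetical helper: it only constructs the URL and sends no request.
    """
    if not chebi_id.startswith("CHEBI:"):
        raise ValueError("expected an identifier of the form 'CHEBI:<number>'")
    params = {
        # 'ft_binding' is assumed to be the binding-site query field.
        "query": f'ft_binding:"{chebi_id}"',
        "format": "json",  # assumed output format parameter
        "size": size,      # assumed page-size parameter
    }
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder identifier; substitute the ChEBI ID of the ligand of interest.
    print(build_ligand_query_url("CHEBI:58210"))
```

The resulting URL can be passed to any HTTP client. Keeping URL construction separate from the network call makes the query string easy to inspect before it is sent.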

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Website view of small molecule annotations in UniProtKB, including cognate ligand binding site annotations (from www.uniprot.org/uniprotkb/P00175/entry). All small molecule annotations are shown in the ‘Function’ section. (A) The ‘Catalytic Activity’ subsection describes enzymatic reactions using Rhea (which is based on ChEBI), while cofactors are described using ChEBI in the ‘Cofactor’ subsection. Standardization of reaction and cofactor descriptions was performed in previous work and is shown here for completeness. (B) The ‘Features’ subsection displays the available binding site annotations for cognate ligands described using ChEBI, the subject of this work. Each ligand has a ‘UniProtKB’ link to launch searches for other proteins binding this ligand, a link out to ‘ChEBI’, and expandable sections such as ‘Publications’ for examining provenance and evidence.
