Metabolites. 2021 Jan 12;11(1):48.
doi: 10.3390/metabo11010048.

Diverse Taxonomies for Diverse Chemistries: Enhanced Representation of Natural Product Metabolism in UniProtKB

Marc Feuermann et al. Metabolites.

Abstract

The UniProt Knowledgebase (UniProtKB) is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support for both the study and the discovery of natural products. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions that is built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Knowledge of natural products and precursors is captured in ChEBI, enzyme-catalyzed reactions in Rhea, and enzymes in UniProtKB/Swiss-Prot, thereby linking chemical structure data directly to protein knowledge. We provide a practical demonstration of how users can search UniProtKB for protein knowledge relevant to natural products through interactive or programmatic queries using metabolite names and synonyms, chemical identifiers, chemical classes, and chemical structures, and we show how to federate UniProtKB with other data and knowledge resources and tools using semantic web technologies such as RDF and SPARQL. All UniProtKB data are freely available for download in a broad range of formats for users to further mine or exploit as an annotation source to enrich other natural product datasets and databases.

Keywords: RDF; SPARQL; biochemical reaction; biocuration; cheminformatics; enzyme; knowledge base; natural product; ontology; semantic web.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
The major classes of natural products and representatives found in Chemical Entities of Biological Interest (ChEBI), Rhea, and UniProtKB. Examples used are ent-kaurene for terpenoids, morphine for alkaloids, steviolmonoside for glycosides, aflatoxin B1 for polyketides, beauvericin for non-ribosomal peptides (NRPs), and ustiloxin B for ribosomally synthesized and post-translationally modified peptides (RiPPs) starting with the precursor ribosomally synthesized cyclic peptide ustiloxin B precursor ustA (UniProtKB: B8NM66).
Figure 2
Enzyme annotation in UniProtKB/Swiss-Prot. The figure highlights the reaction catalyzed by the 6-methylsalicylic acid decarboxylase (patG) of Penicillium expansum (UniProtKB: A0A075TXZ1). The reaction is evidenced by a publication and EC 4.1.1.52 (enzyme class). Hover over the name of a reaction participant to display a tooltip allowing navigation between Rhea, ChEBI, and UniProt resources.
Figure 3
Curation of the patulin biosynthetic pathway in Penicillium expansum. The figure shows a schematic representation of the patulin biosynthesis pathway [28], which was fully curated in UniProtKB/Swiss-Prot. This pathway map was reconstructed using the 2D structures from ChEBI, the reactions provided by Rhea, and their corresponding enzymes as annotated in UniProtKB (identifiers for each are indicated). UniProtKB/Swiss-Prot also provides additional data such as the subcellular location of each protein when known. The solid arrows indicate enzymatic reactions; the dashed arrows indicate transport reactions. The subcellular location of the enzymes illustrates the importance of compartmentalization for natural product biosynthesis [28]. All UniProtKB entries for proteins involved in the patulin biosynthesis pathway can be retrieved using the following URL: www.uniprot.org/uniprot/?query=patulin&fil=organism%3A%22Penicillium+expansum+%28Blue+mold+rot+fungus%29+%5B27334%5D%22+AND+reviewed%3Ayes.
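The percent-encoded URL in the Figure 3 caption is hard to read at a glance. The following is a minimal Python sketch that decodes it into its search parameters and re-encodes them, using only the standard library. The `https://` scheme and the helper names `decode_query` and `encode_query` are assumptions added for illustration, and the URL layout is taken as printed in the caption; the live UniProt website may use a different layout today.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# URL as printed in the Figure 3 caption. The "https://" scheme is an
# assumption added here; the caption gives the URL without a scheme.
FIGURE3_URL = (
    "https://www.uniprot.org/uniprot/?query=patulin"
    "&fil=organism%3A%22Penicillium+expansum+%28Blue+mold+rot+fungus%29"
    "+%5B27334%5D%22+AND+reviewed%3Ayes"
)


def decode_query(url: str) -> dict:
    """Hypothetical helper: return the URL's query parameters, decoded."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; each key occurs once in this URL.
    return {key: values[0] for key, values in parse_qs(query).items()}


def encode_query(base: str, params: dict) -> str:
    """Hypothetical helper: rebuild a URL from a base and parameters."""
    return f"{base}?{urlencode(params)}"


params = decode_query(FIGURE3_URL)
rebuilt = encode_query("https://www.uniprot.org/uniprot/", params)

for key, value in params.items():
    print(f"{key}: {value}")
```

Decoded, the `query` parameter carries the free-text term and the `fil` parameter carries the filter that restricts results to reviewed entries from the named organism. Changing either value in `params` and calling `encode_query` again yields a variant of the same search.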
Figure 4
Sample query using the UniProt website advanced search tool to retrieve, in UniProtKB/Swiss-Prot, fungal oxidoreductases that metabolize malonyl-CoA, with published 3D structure(s). The query retrieves expert-curated (Field: Reviewed > Reviewed) oxidoreductases (Field: Gene Ontology [GO], Term: “oxidoreductase activity [16491]”) of fungal origin (Field: Taxonomy, Term: “Fungi [4751]”) metabolizing (Field: Function > Catalytic activity, Term: “inchikey: LTYOQGRJFJAKNA-DVVLENMVSA”) and for which protein 3D structure data are available (Field: Cross-references > 3D structure databases > PDB).
Figure 5
Graphical representation of the sample federated SPARQL query displayed in Figure S7 and its results. The query is performed at the UniProt SPARQL endpoint, which first “calls” the Rhea SPARQL endpoint, which itself “calls” the IDSM SPARQL endpoint. The actual compound similarity search (sachem:similaritySearch) is performed by the IDSM endpoint, which returns ChEBI compounds identical or similar to patulin to the Rhea endpoint. The Rhea endpoint then assembles a list of matching reactions and passes this list back to the UniProt endpoint, which finally maps the reactions to all possible enzymes and creates the desired result set of cognate chemicals, reactions, and enzymes. The results are available at tinyurl.com/sparql-uniprot.
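The call chain in the Figure 5 caption corresponds to nested SPARQL `SERVICE` clauses. The sketch below builds such a query string in Python to make the nesting visible; it is not the query from Figure S7, which is not reproduced here. The endpoint URLs, prefixes, property paths, the `sachem:query` and `sachem:cutoff` parameters, and the helper name `build_federated_query` are all assumptions for illustration and should be checked against the UniProt, Rhea, and IDSM documentation. Only `sachem:similaritySearch` is named in the caption.

```python
from string import Template

# Assumed endpoint URLs (not given in the caption); verify before use.
RHEA_ENDPOINT = "https://sparql.rhea-db.org/sparql"
IDSM_ENDPOINT = "https://idsm.elixir-czech.cz/sparql/endpoint/chebi"

# Assumed vocabulary throughout; only sachem:similaritySearch comes from
# the caption. The query is meant to be sent to the UniProt endpoint.
QUERY_TEMPLATE = Template("""\
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rh: <http://rdf.rhea-db.org/>
PREFIX sachem: <http://bioinfo.uochb.cas.cz/rdf/v1.0/sachem#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?protein ?reaction ?chebi
WHERE {
  # UniProt "calls" Rhea ...
  SERVICE <$rhea> {
    # ... and Rhea "calls" IDSM for the structure similarity search.
    SERVICE <$idsm> {
      ?chebi sachem:similaritySearch [
        sachem:query "$smiles" ;
        sachem:cutoff "$cutoff"^^xsd:double
      ] .
    }
    # Rhea assembles the reactions involving the matched compounds.
    ?reaction rh:side/rh:contains/rh:compound/rh:chebi ?chebi .
  }
  # UniProt maps the returned reactions to enzymes.
  ?protein up:annotation/up:catalyticActivity/up:catalyzedReaction ?reaction .
}
""")


def build_federated_query(smiles: str, cutoff: float = 0.8) -> str:
    """Hypothetical helper: fill the template for one query structure."""
    # Escape backslashes and quotes so the SMILES is a valid SPARQL literal.
    literal = smiles.replace("\\", "\\\\").replace('"', '\\"')
    return QUERY_TEMPLATE.substitute(
        rhea=RHEA_ENDPOINT, idsm=IDSM_ENDPOINT, smiles=literal, cutoff=cutoff
    )
```

Calling `build_federated_query` with the SMILES string of the compound of interest returns the query text, which could then be pasted into a SPARQL endpoint's web form or sent with an HTTP client. The innermost `SERVICE` block is evaluated by IDSM, its enclosing block by Rhea, and the outer pattern by UniProt, mirroring the order described in the caption.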

