J Mol Biol. 2022 Jun 15;434(11):167514.
doi: 10.1016/j.jmb.2022.167514. Epub 2022 Feb 25.

PubChem Protein, Gene, Pathway, and Taxonomy Data Collections: Bridging Biology and Chemistry through Target-Centric Views of PubChem Data

Sunghwan Kim et al. J Mol Biol.

Abstract

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public chemical database at the U.S. National Institutes of Health. Visited by millions of users every month, it serves as a key chemical information resource for biomedical research communities. Data in PubChem come from hundreds of contributors and are organized into multiple collections by record type. Among these are the Protein, Gene, Pathway, and Taxonomy data collections. Records in these collections contain information on chemicals related to a given biological target (i.e., protein, gene, pathway, or taxon), helping users to analyze and interpret the biological activity data of molecules. In addition, annotations about the biological targets are collected from authoritative or curated data sources and integrated into the four collections. The content can be accessed programmatically through PubChem's web service interfaces (including PUG View). A machine-readable representation of this content is also provided within PubChemRDF.
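
As an illustration of the programmatic access mentioned above, the following is a minimal Python sketch that builds a PUG View request URL for a record in one of the four target-centric collections. The abstract does not give the URL layout; the pattern used here (`/rest/pug_view/data/<collection>/<identifier>/JSON`) is an assumption based on PUG View's general request format, and the function names `pug_view_url` and `fetch_record` are hypothetical helpers introduced only for this example. The pathway accession is the one shown in Figure 1.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL for PUG View data requests (not stated in the abstract).
PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data"

# The four target-centric collections named in the abstract.
TARGET_COLLECTIONS = ("protein", "gene", "pathway", "taxonomy")


def pug_view_url(collection: str, identifier: str, fmt: str = "JSON") -> str:
    """Build a PUG View URL for one record (hypothetical helper)."""
    if collection not in TARGET_COLLECTIONS:
        raise ValueError(f"unsupported collection: {collection!r}")
    # Leave ':' unescaped because pathway accessions look like 'INOH:MI0035772'.
    safe_id = urllib.parse.quote(str(identifier), safe=":")
    return f"{PUG_VIEW_BASE}/{collection}/{safe_id}/{fmt}"


def fetch_record(collection: str, identifier: str, timeout: float = 30.0) -> dict:
    """Download one record and parse it as JSON (requires network access)."""
    url = pug_view_url(collection, identifier)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Pathway accession taken from the Summary page URL in Figure 1.
    print(pug_view_url("pathway", "INOH:MI0035772"))
```

Only `fetch_record` touches the network; the URL builder can be checked offline. The structure of the returned JSON is not described in the abstract, so any code that reads specific fields should be written against the live response.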

Keywords: bioactivity; bioinformatics; cheminformatics; drug discovery; public chemical database.

Conflict of interest statement

Declaration of interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Figure 1.
Getting information on entities involved in the “Glycolysis and Gluconeogenesis” pathway from its Pathway Summary page (https://pubchem.ncbi.nlm.nih.gov/pathway/INOH:MI0035772). The Summary page shows lists of chemicals, proteins, and genes associated with the pathway. Clicking the items on the lists leads to their Compound, Protein, or Gene Summary pages, which provide more detailed information on them. The Pathway Summary page has a link to the corresponding record in the original data source, which helps users to get additional information on the pathway.
Figure 2.
Searching PubChem using a text query, with “glycolysis” as an example. When a text query is entered in the search box (step ①), all data collections within PubChem are searched simultaneously and the matching records found in each are returned together. Clicking the “Pathways” tab (step ②) shows the hits from the Pathway collection. The hit list can be refined or sorted by selected attributes (steps ③ and ④). The additional controls in the right column allow users to download the hit list, save it for later use, or get other records associated with the hits (step ⑤). Clicking one of the pathway records opens its Summary page, which provides comprehensive information on the record (step ⑥).
