Review
Arch Toxicol. 2011 Sep;85(9):1015-33.
doi: 10.1007/s00204-011-0705-2. Epub 2011 Apr 27.

A survey of metabolic databases emphasizing the MetaCyc family

Peter D Karp et al. Arch Toxicol. 2011 Sep.

Abstract

Thanks to the confluence of genome sequencing and bioinformatics, the number of metabolic databases has expanded from a handful in the mid-1990s to several thousand today. These databases lie within distinct families that have common ancestry and common attributes. The main families are the MetaCyc, KEGG, Reactome, Model SEED, and BiGG families. We survey these database families, as well as important individual metabolic databases, including multiple human metabolic databases. The MetaCyc family is described in particular detail. It contains well over 1,000 databases, including highly curated databases for Escherichia coli, Saccharomyces cerevisiae, Mus musculus, and Arabidopsis thaliana. These databases are available through a number of web sites that offer a range of software tools for querying and visualizing metabolic networks. These web sites also provide multiple tools for analysis of gene expression and metabolomics data, including visualization of those datasets on metabolic network diagrams and over-representation analysis of gene sets and metabolite sets.
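Over-representation analysis of a gene set, as mentioned above, is often framed as a one-sided hypergeometric test. The abstract does not state which statistic the web sites actually use, so the following is only a minimal sketch under that assumption. The function name, parameter names, and the numbers in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """Probability of seeing at least k pathway genes in a sample.

    Assumed setup (not taken from the source): N genes in the organism,
    K of them annotated to the pathway, n genes in the submitted set,
    and k of those n annotated to the pathway.
    """
    upper = min(K, n)          # cannot observe more hits than this
    total = comb(N, n)         # number of possible gene sets of size n
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical numbers: 4,000 genes, 12 in the pathway,
# 50 genes submitted, 6 of them in the pathway.
p_value = hypergeom_tail(6, 12, 50, 4000)
print(f"p = {p_value:.3g}")
```

A small p-value suggests the pathway appears in the gene set more often than chance would predict. When many pathways are tested at once, a multiple-testing correction would normally be applied as well.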


Figures

Figure 1
A typical MetaCyc pathway diagram. Commentary and other data included on the pathway page are not shown.
Figure 2
A MetaCyc superpathway. Superpathways are composed of several smaller pathways and are used to provide a more comprehensive view of a metabolic process. In this example, multiple pathways that relate to chorismate metabolism (e.g., chorismate biosynthesis, tetrahydrofolate biosynthesis, enterobactin biosynthesis) are integrated into a single diagram. Since superpathways can be very large, Pathway Tools automatically displays them at a lower detail level, trying to fit the full diagram on the screen. In this example, enzymes, genes, and even some of the metabolite intermediates are not displayed. The user can click the “More Detail” button at the top to increase the detail level incrementally, adding all intermediates, enzymes, and finally metabolite structures to the display.
Figure 3
A reaction that involves compound classes.
Figure 4
Searching HumanCyc for several monoisotopic molecular weights, with a specified tolerance of 5 ppm. This type of search is useful for analyzing compounds identified by mass spectrometry: it lets researchers find candidate compounds known to exist in the organism and learn about their roles in the metabolic network.
Figure 5
Query results. This figure shows the results of a search of the HumanCyc PGDB for proteins curated with the GO term 0006096 – glycolysis. The results are returned in a table, where each result is a hyperlink to the actual object. By clicking the triangles next to each column heading it is possible to sort the table according to the data in that column, in either ascending or descending order.
Figure 6
The Regulation Summary Diagram, which includes elements such as other genes in the same transcription unit, the sigma factor involved in transcription, the gene product and complexes formed by it, and different regulators that control transcription, translation, and activity. This example, which describes the trpA gene of Escherichia coli, includes the TrpR transcriptional regulator and the compound tryptophan (which also functions as a transcription regulator), a small RNA molecule that regulates translation of the mRNA, and the compound pyridoxal phosphate that activates the enzyme.
Figure 7
The Multi-Genome Browser makes it easy to notice even small differences among related genomic regions. In this example the genomic regions surrounding the ompW genes of several Escherichia coli strains are aligned.
Figure 8
The Cellular Overview. The figure shows the Cellular Overview for the cyanobacterium Synechococcus elongatus PCC 7942. A detailed description of the diagram is provided in the text. Several items have been highlighted on this diagram: the compound L-lysine (in green), peroxidase enzymes (in red), and genes whose names contain the substring “trp” (in purple). The switchboard, to the right of the image, enables turning the individual highlighting operations on and off.
Figure 9
The Cellular Omics Viewer. This figure shows the Cellular Omics Viewer for the bacterium Escherichia coli with a gene transcription dataset overlaid (Tao et al. 1999). The color of each reaction indicates the transcription level of the genes encoding the enzymes that catalyze it. The legend mapping colors to data values is not shown in the figure. Hovering the mouse cursor over a compound or a reaction triggers pop-ups that provide information and enable navigation to the relevant compound page, or to a pathway display that retains the omics information (see Figure 10).
Figure 10
Omics data displayed on a pathway diagram. Several display options are shown, including an X-Y plot, histogram, and heat map.
Figure 11
Enrichment Analysis. In this example a group of genes was analyzed for pathway enrichment. The results show that this group of genes was highly enriched for amino acid biosynthesis pathways, specifically those for the biosynthesis of histidine, lysine, and proline.
Figure 12
Species comparison between Homo sapiens and Escherichia coli. Reactions shared by both organisms are highlighted in red.
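
The monoisotopic mass search described in the Figure 4 caption can be illustrated with a small sketch of ppm-tolerance matching. This is not the HumanCyc implementation. The function names and the tiny compound table are hypothetical, the masses are approximate illustrative values, and taking the tolerance relative to the query mass is an assumption.

```python
def ppm_window(mass, ppm=5.0):
    """Return the (low, high) mass bounds for a tolerance given in ppm."""
    delta = mass * ppm / 1e6   # 5 ppm of 200 Da is 0.001 Da
    return mass - delta, mass + delta


def match_masses(query_masses, compounds, ppm=5.0):
    """Map each query mass to the names of compounds within the tolerance.

    `compounds` is a hypothetical dict of name -> monoisotopic mass (Da).
    """
    hits = {}
    for q in query_masses:
        low, high = ppm_window(q, ppm)
        hits[q] = [name for name, m in compounds.items() if low <= m <= high]
    return hits


# Illustrative table only; a real search would run against a full database.
compounds = {
    "D-glucose": 180.06339,
    "L-tryptophan": 204.08988,
}

for mass, names in match_masses([180.0634, 204.0899, 250.0], compounds).items():
    print(mass, names or "no candidates")
```

Because the window scales with the mass, the same 5 ppm setting is tighter in absolute terms for small molecules than for large ones.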
