Genome Biol. 2009;10(6):R63.
doi: 10.1186/gb-2009-10-6-r63. Epub 2009 Jun 12.

The conservation and evolutionary modularity of metabolism

José M Peregrín-Alvarez et al. Genome Biol. 2009.

Abstract

Background: Cellular metabolism is a fundamental biological system consisting of myriads of enzymatic reactions that together fulfill the basic requirements of life. The recent availability of vast amounts of sequence data from diverse sets of organisms provides an opportunity to systematically examine metabolism from a comparative perspective. Here we supplement existing genome and protein resources with partial genome datasets derived from 193 eukaryotes to present a comprehensive survey of the conservation of metabolism across 26 taxa representing the three domains of life.

Results: In general, metabolic enzymes are highly conserved. However, organizing these enzymes within the context of functional pathways revealed a spectrum of conservation, from pathways that are highly conserved (for example, carbohydrate, energy, amino acid and nucleotide metabolism) to those specific to individual taxa (for example, glycan metabolism and secondary metabolite pathways). When a novel co-conservation analysis was applied, KEGG-defined pathways did not generally display evolutionary coherence. Instead, such modularity appears restricted to smaller subsets of enzymes. Expanding the analyses to a global metabolic network revealed a highly conserved, but nonetheless flexible, 'core' of enzymes largely involved in multiple reactions across different pathways. Enzymes and pathways at the periphery of this network were less well conserved and were associated with taxon-specific innovations.

Conclusions: These findings point to an emerging picture in which a core of enzyme activities involving amino acid, energy, carbohydrate and lipid metabolism has evolved to provide the basic functions required for life. However, the precise complement of enzymes within this core is flexible and differs from species to species.

Figures

Figure 1
Representation of enzymes within three large-scale datasets. (a) Coverage of enzymes and genes provided by the three different datasets: the non-redundant protein database (nr); partial genomes; and complete genomes. Fifty percent of all enzymes are associated with approximately 15% of all partial genomes, approximately 60% of all complete genomes and approximately 75% of the nr categories used in this study. Compared to all the genes within the partial and complete genome datasets, the enzymes are more highly represented. (b) Relationships of enzyme coverage between the partial and complete genome datasets. Each point indicates a discrete enzyme (color indicates superclass membership - see inset key in (d)). Enzymes involved in secondary metabolism appear to be more highly represented in the partial genome datasets than the complete genome datasets. (c,d) As for (b) but showing the relationship of enzyme coverage between the nr dataset and the complete and partial genome datasets, respectively.
Figure 2
Heatmap showing the conservation of individual metabolic pathways. Each row represents an individual metabolic pathway; rows are grouped by superclass membership (as defined by KEGG). Each column represents a taxonomic division (see Materials and methods; Table 1). Colored tiles indicate the level of conservation (percentage of enzymes detected) of each pathway within each taxonomic group (see inset color key, top left). For example, many glycan metabolic pathways are poorly conserved except in several groups of metazoans.
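The conservation score behind each heatmap tile (percentage of a pathway's enzymes detected in a taxonomic group) can be illustrated with a minimal sketch. The function name `pathway_conservation` and the toy enzyme sets are hypothetical, and the sketch assumes a single pooled set of detected enzymes per taxon; how the authors aggregated detections across the genomes within a taxon is not stated in this caption.

```python
def pathway_conservation(pathway_enzymes, detected_enzymes):
    """Return the percentage of a pathway's enzymes detected in a taxon."""
    pathway = set(pathway_enzymes)
    if not pathway:
        raise ValueError("pathway has no enzymes")
    return 100.0 * len(pathway & set(detected_enzymes)) / len(pathway)


# Toy data for illustration only (not taken from the paper).
toy_pathway = {"2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13"}
toy_taxon_detected = {"2.7.1.1", "5.3.1.9", "4.1.2.13", "1.1.1.1"}

score = pathway_conservation(toy_pathway, toy_taxon_detected)
print(score)
```

Three of the four toy pathway enzymes are present in the detected set, so the tile for this pathway and taxon would correspond to 75%.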
Figure 3
Representative examples of pathways containing evolutionary submodules of enzymes. (a) Clustergrams showing the phylogenetic profiles of individual enzymes in three metabolic pathways. For each clustergram, rows indicate individual enzymes and columns indicate individual genomes. A grey box indicates that the enzyme was detected in that genome; a black box indicates that it was not. Hierarchical clustering was performed with Cluster3.0 [70], using Spearman rank correlation coefficients and average linkage. Colored boxes indicate manually assigned groups of enzymes with similar phylogenetic profiles. (b) KEGG pathway representations of the three clustered pathways presented in (a). Enzymes are colored by the groups derived from (a). Within each pathway, groups of similarly colored enzymes localize to specific areas of the pathway, suggesting an evolutionarily cohesive module of function. For example, in diterpenoid biosynthesis, the red cluster of enzymes forms a spatially distinct section of the pathway connected to the orange cluster of enzymes, while the green cluster of enzymes appears to form the beginning of the pathway.
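The clustering step named in this caption (Spearman rank correlation with average linkage) can be sketched in plain Python. This is not Cluster3.0 and not the authors' code: the distance 1 − ρ, the handling of constant profiles, the function names, and the toy presence/absence profiles are all assumptions made for illustration.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the closest clusters (mean pairwise 1 - rho) until n_clusters remain."""
    dist = {frozenset((a, b)): 1 - spearman(profiles[a], profiles[b])
            for a, b in combinations(profiles, 2)}
    clusters = [[name] for name in profiles]

    def linkage(pair):
        c1, c2 = clusters[pair[0]], clusters[pair[1]]
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy phylogenetic profiles: 1 = enzyme detected in that genome, 0 = not detected.
toy_profiles = {
    "E1": [1, 1, 1, 0, 0, 0],
    "E2": [1, 1, 1, 0, 0, 1],
    "E3": [0, 0, 0, 1, 1, 1],
    "E4": [0, 0, 1, 1, 1, 1],
}
modules = average_linkage(toy_profiles, n_clusters=2)
print(modules)
```

In the toy data, E1 and E2 share most of their genomes, as do E3 and E4, so the two resulting groups play the role of the manually assigned colored boxes in panel (a).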
Figure 4
Conservation within the global metabolic network. An integrated view of metabolism in which individual enzymes (1,329 nodes) are connected through common metabolites (5,906 edges) (see Materials and methods). Colors of nodes represent which metabolic superclass (as defined by KEGG) each enzyme belongs to (see inset key). Node size indicates the number of genomes (of 167 complete genomes) in which the enzyme could be detected. A number of pathways with connected enzymes are indicated with red circles for illustrative purposes. While some nodes such as those involved in diterpenoid biosynthesis - pathway 11 - form a separate network, the vast bulk of metabolic pathways form connections with many others (for example, Nitrogen metabolism - pathway 21).
Figure 5
Metabolic network properties. The graphs indicate the relationships between enzyme superclass categories, conservation and connection within the metabolic network. (a) Number of connections as a function of enzyme conservation. (b) Centrality (as measured by betweenness) of enzymes as a function of conservation. (c-e) Enzyme superclass and its conservation, connection and centrality properties.
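The network quantities in Figures 4 and 5 (enzymes as nodes linked through common metabolites, number of connections, and betweenness centrality) can be illustrated with a small self-contained sketch. The enzyme and metabolite names are hypothetical, and linking any two enzymes that share any metabolite is an assumption; the exact linking rules are in the paper's Materials and methods, which are not reproduced here. Betweenness is computed with Brandes' algorithm, a standard method chosen for this sketch rather than one named in the source.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_enzyme_network(enzyme_metabolites):
    """Link two enzymes whenever they share at least one metabolite."""
    adjacency = {enzyme: set() for enzyme in enzyme_metabolites}
    for a, b in combinations(enzyme_metabolites, 2):
        if enzyme_metabolites[a] & enzyme_metabolites[b]:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def betweenness(adjacency):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    score = dict.fromkeys(adjacency, 0.0)
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source.
        stack, preds = [], defaultdict(list)
        sigma = dict.fromkeys(adjacency, 0)
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to the source.
        delta = dict.fromkeys(adjacency, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected shortest path was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Toy data: three hypothetical enzymes forming a linear chain A - B - C.
toy_enzymes = {
    "enzA": {"met1", "met2"},
    "enzB": {"met2", "met3"},
    "enzC": {"met3", "met4"},
}
network = build_enzyme_network(toy_enzymes)
connections = {enzyme: len(neighbors) for enzyme, neighbors in network.items()}
centrality = betweenness(network)
print(connections, centrality)
```

In this chain, the middle enzyme has two connections and lies on the only shortest path between the other two, so it is the only node with non-zero betweenness.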
Figure 6
Crosstalk between metabolic pathways. The network diagram shows the enzymes shared between pathways. Each pathway is represented by a node, and an edge connects two pathways that have enzymes in common. Nodes are colored according to their superclass category; node size indicates the number of enzymes in that pathway; and edge thickness indicates the number of enzymes shared by the two connected pathways (see inset keys).
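The crosstalk network described in this caption reduces to counting the enzymes shared by each pair of pathways. The sketch below is a minimal illustration; the function name `pathway_crosstalk` and the pathway-to-enzyme assignments are toy assumptions, not data from the paper.

```python
from itertools import combinations


def pathway_crosstalk(pathways):
    """Map each pathway pair to the number of enzymes the two pathways share."""
    edges = {}
    for a, b in combinations(sorted(pathways), 2):
        shared = len(pathways[a] & pathways[b])
        if shared:  # pathways with no common enzyme get no edge
            edges[(a, b)] = shared
    return edges


# Toy pathway membership for illustration only.
toy_pathways = {
    "Glycolysis": {"2.7.1.1", "5.3.1.9", "4.1.2.13", "1.2.1.12"},
    "Pentose phosphate pathway": {"5.3.1.9", "4.1.2.13", "1.1.1.49"},
    "Diterpenoid biosynthesis": {"4.2.3.19"},
}

node_sizes = {name: len(enzymes) for name, enzymes in toy_pathways.items()}
edge_weights = pathway_crosstalk(toy_pathways)
print(node_sizes, edge_weights)
```

Here `node_sizes` corresponds to node size in the figure and `edge_weights` to edge thickness; the toy diterpenoid pathway shares no enzymes and would appear as an unconnected node.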

References

    1. Caspi R, Foerster H, Fulcher CA, Hopkinson R, Ingraham J, Kaipa P, Krummenacker M, Paley S, Pick J, Rhee SY, Tissier C, Zhang P, Karp PD. MetaCyc: a multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res. 2006;34:D511–516. doi: 10.1093/nar/gkj128.
    2. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006;34:D354–357. doi: 10.1093/nar/gkj102.
    3. Green ML, Karp PD. A Bayesian method for identifying missing enzymes in predicted metabolic pathway databases. BMC Bioinformatics. 2004;5:76. doi: 10.1186/1471-2105-5-76.
    4. Ma H, Zeng AP. Reconstruction of metabolic networks from genome data and analysis of their global structure for various organisms. Bioinformatics. 2003;19:270–277. doi: 10.1093/bioinformatics/19.2.270.
    5. Paley SM, Karp PD. Evaluation of computational metabolic-pathway predictions for Helicobacter pylori. Bioinformatics. 2002;18:715–724. doi: 10.1093/bioinformatics/18.5.715.
