Genome Biol. 2007;8(2):R26.

doi: 10.1186/gb-2007-8-2-r26.

A network perspective on the evolution of metabolism by gene duplication

Juan Javier Díaz-Mejía¹, Ernesto Pérez-Rueda, Lorenzo Segovia

Affiliation

¹ Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Av. Universidad 2001, Col. Chamilpa, Cuernavaca, Morelos, CP 62210 México. jdime@ibt.unam.mx

PMID: 17326820
PMCID: PMC1852415
DOI: 10.1186/gb-2007-8-2-r26

Juan Javier Díaz-Mejía et al. Genome Biol. 2007.

Abstract

Background: Gene duplication followed by divergence is one of the main sources of metabolic versatility. The patchwork and stepwise models of metabolic evolution help us to understand these processes, but their assumptions are relatively simplistic. We used a network-based approach to determine the influence of metabolic constraints on the retention of duplicated genes.

Results: We detected duplicated genes by looking for enzymes sharing homologous domains and uncovered an increased retention of duplicates for enzymes catalyzing consecutive reactions, as illustrated by the ligases acting in the biosynthesis of peptidoglycan. As a consequence, metabolic networks show a high retention of duplicates within functional modules, and we found a preferential biochemical coupling of reactions that partially explains this bias. A similar situation was found in enzyme-enzyme interaction networks, but not in interaction networks of non-enzymatic proteins or gene transcriptional regulatory networks, suggesting that the retention of duplicates results from the biochemical rules governing substrate-enzyme-product relationships. We confirmed a high retention of duplicates between chemically similar reactions, as illustrated by fatty-acid metabolism. The retention of duplicates between chemically dissimilar reactions is, however, also greater than expected by chance. Finally, we detected a significant retention of duplicates as groups, instead of single pairs.
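
The duplicate-detection step described above can be illustrated with a minimal sketch. It assumes each enzyme is annotated with a set of domain identifiers and treats two enzymes as duplicates when they share at least one domain; the enzyme names, domain labels, and edges below are hypothetical placeholders, not data from the study.

```python
from itertools import combinations

# Hypothetical enzyme -> domain annotations (placeholder identifiers,
# not real database accessions).
ENZYME_DOMAINS = {
    "E1": {"domA"},
    "E2": {"domA"},
    "E3": {"domC"},
    "E4": {"domC"},
}

# Hypothetical consecutive reactions: the product of the first enzyme's
# reaction is a substrate of the second.
CONSECUTIVE = [("E1", "E2"), ("E2", "E3"), ("E3", "E4")]


def are_duplicates(a, b, domains=ENZYME_DOMAINS):
    """Treat two enzymes as duplicates if they share at least one domain."""
    return bool(domains[a] & domains[b])


def retention_frequency(pairs, domains=ENZYME_DOMAINS):
    """Fraction of the given enzyme pairs that are duplicates."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(are_duplicates(a, b, domains) for a, b in pairs)
    return hits / len(pairs)


# Retention among consecutive pairs versus among all possible pairs.
consecutive_freq = retention_frequency(CONSECUTIVE)
all_pairs_freq = retention_frequency(combinations(sorted(ENZYME_DOMAINS), 2))
```

In this toy data the consecutive pairs show a higher retention frequency than the all-pairs background. The sketch only shows the shape of the comparison, not its statistical assessment.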

Conclusion: Our results indicate that in silico modeling of the origin and evolution of metabolism is improved by the inclusion of specific functional constraints, such as the preferential biochemical coupling of reactions. We suggest that the stepwise and patchwork models are not independent of each other: in fact, the network perspective enables us to reconcile and combine these models.

Figures

**Figure 1**
Preferential biochemical coupling of reactions in metabolic networks. **(a)** Homologous transferases PurF and Gpt from *E. coli* catalyze consecutive chemically similar reactions. Their origin can be explained by both the stepwise and the patchwork models. **(b)** Homologous ligases involved in peptidoglycan biosynthesis whose origin can be explained by both the stepwise and the patchwork models. A distant homolog (FolC) acts in folate metabolism. **(c)** Frequencies of reaction types (EC:a.b) in the *E. coli* K12 metabolic network, according to KEGG (hereafter called EcoKegg). **(d)** Frequencies of consecutive reaction types (EC:a.b → EC:w.x) in EcoKegg were compared against the expected values using a set of null Maslov-Sneppen models (see Materials and methods). The Z-score (color-scale bar at top) indicates the number of standard deviations between the real and the average expected frequencies. Consecutive reaction types overrepresented in real networks are shown in green-to-yellow; underrepresented ones are shown in red. The diagonal (pink box) highlights consecutive chemically similar reactions, including the ligases synthesizing peptidoglycan (pink arrow). Reaction types were sorted vertically using hierarchical clustering to detect highly related reaction types, such as EC:1.5, EC:1.7 and EC:2.1 (center of plot).
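
The comparison in panel (d) can be sketched as follows. This is not the authors' implementation: it assumes a directed reaction network, uses the standard degree-preserving edge swap usually associated with Maslov-Sneppen null models, and computes a Z-score as (observed − null mean) / null standard deviation. The toy network, the assignment of reaction types to nodes, and the function names are hypothetical.

```python
import random
import statistics

# Toy reaction network (assumption): r1..r3 share one reaction type,
# r4..r6 another. The EC labels are illustrative only.
REACTION_TYPE = {
    "r1": "EC:6.3", "r2": "EC:6.3", "r3": "EC:6.3",
    "r4": "EC:2.4", "r5": "EC:2.4", "r6": "EC:2.4",
}
EDGES = [("r1", "r2"), ("r2", "r3"), ("r3", "r4"), ("r4", "r5"), ("r5", "r6")]


def maslov_sneppen_rewire(edges, n_swaps, rng):
    """Degree-preserving rewiring of a directed edge list.

    Two edges (a -> b) and (c -> d) are swapped to (a -> d) and (c -> b);
    swaps that would create self-loops or duplicate edges are rejected.
    """
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges


def count_same_type(edges, reaction_type):
    """Number of consecutive reaction pairs with the same reaction type."""
    return sum(reaction_type[u] == reaction_type[v] for u, v in edges)


def z_score(observed, null_values):
    """Standard deviations between the observed value and the null mean."""
    mean = statistics.mean(null_values)
    sd = statistics.pstdev(null_values)
    return float("nan") if sd == 0 else (observed - mean) / sd


rng = random.Random(0)  # fixed seed so the sketch is reproducible
observed = count_same_type(EDGES, REACTION_TYPE)
null_counts = [
    count_same_type(maslov_sneppen_rewire(EDGES, 100, rng), REACTION_TYPE)
    for _ in range(200)
]
z = z_score(observed, null_counts)
```

A positive `z` would correspond to an overrepresented consecutive pairing and a negative `z` to an underrepresented one. On a real network the same count would be repeated for every ordered pair of reaction types (EC:a.b → EC:w.x) rather than only for same-type pairs.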

**Figure 2**
Influence of chemical similarity and distance on the retention of duplicates. **(a)** Frequencies of retained duplicates (histogram bars) in EcoKegg are shown for the whole reaction set (ALL), and the subsets of chemically similar reactions (CSRs) and chemically different reactions (CDRs) at different distances (metabolic steps). Blue bars indicate three standard deviations (σ) from these frequencies. Deviations were obtained by random sampling. Red dots represent the average expected frequencies ± 3σ obtained using Maslov-Sneppen models. The rewiring to construct the null model is shown below the graph. **(b)** A similar procedure to (a) was carried out, using null functionally similar models to control for the influence of the preferential biochemical coupling of reactions. Symbols as in (a). Compared with Maslov-Sneppen models, in which all nodes are equally eligible for change, in functionally similar models the preferential biochemical coupling of reactions restricts the choices. **(c)** Retention of duplicates in the gene regulatory network of *E. coli* as a function of the distance (number of regulatory interactions) between transcription factors and target genes. **(d)** Retention of duplicates in a protein-protein interaction network of *E. coli*. The full set of interactions (ALL), and the subsets of enzyme-enzyme (EC-EC) and non-enzymatic protein-protein (P-P) interactions are shown. In (c) and (d) red dots represent averages obtained using Maslov-Sneppen models.
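
A minimal sketch of how retention could be tabulated against distance is given below. It makes two assumptions: distance is the directed shortest-path length in reaction steps, and two enzymes count as duplicates when they share at least one domain. The four-node chain and the domain labels are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical linear pathway A -> B -> C -> D with placeholder domains.
ADJACENCY = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
DOMAINS = {"A": {"x"}, "B": {"x"}, "C": {"x"}, "D": {"y"}}


def shortest_distances(adjacency, source):
    """Breadth-first search: steps from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def retention_by_distance(adjacency, domains):
    """Fraction of reachable pairs sharing a domain, grouped by distance."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for source in adjacency:
        for target, d in shortest_distances(adjacency, source).items():
            if d == 0:
                continue  # skip the source itself
            totals[d] += 1
            hits[d] += bool(domains[source] & domains[target])
    return {d: hits[d] / totals[d] for d in sorted(totals)}


retention = retention_by_distance(ADJACENCY, DOMAINS)
```

Each value in `retention` plays the role of one histogram bar in panel (a). The expected values (red dots) would come from repeating the same tabulation on rewired null networks.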

**Figure 3**
Influence of network modularity on the retention of duplicates. **(a)** A hierarchical clustering was carried out to delimit modules in metabolic networks. Colors denote different modules in EcoKegg. **(b)** Metabolic pathways (branches in the trees) within and across modules were compared using a measure of evolutionary distance (ED). Modules comprising related branches are indicated by color as in (a). A value of ED closer to zero (the darker squares) implies a greater retention of duplicates between the two given pathways. **(c)** Observed ED values were compared against those expected by chance, after random shuffling of protein domains. A Z-score < -3 (green) refers to significant ED values (P < 0.001).

**Figure 4**
Retention of duplicates as groups and single entities. **(a)** The fatty-acid degradative and biosynthetic routes illustrate the retention of duplicates as groups. The same colors in EC number boxes denote duplicates. **(b)** Retention of duplicates acting consecutively. Five hypothetical scenarios were analyzed (left panel). Boxes of the same color denote duplicates. The number and letter (for example, E2 and E2') indicate the place of the reaction in the series. Scenarios (I) and (V) have a common reaction followed or preceded by two possible reactions. In (I) gene duplication was detected, in (V) it was not. Scenarios (II), (III) and (IV) involve pairs of consecutive reactions in two branches of the network. In (II) both pairs are duplicates, in (III) only one pair is duplicated, and in (IV) none of the pairs are duplicates. From this diagram one can see that one pair can participate in more than one scenario, looking upstream or downstream in the network flux. The histogram on the right shows the frequency for each scenario. We present the results for the four databases analyzed herein. The networks were reconstructed eliminating the top 20 hubs. These results are the comparison of all-against-all pairs (EC:a.b → EC:w.x), including CSRs as well as CDRs. Red dots represent the expected average frequencies ± 3σ obtained using Maslov-Sneppen models.
