Mol Syst Biol. 2017 Jan 16;13(1):907.

doi: 10.15252/msb.20167150.

Genomewide landscape of gene-metabolome associations in Escherichia coli

Tobias Fuhrer¹, Mattia Zampieri¹, Daniel C Sévin¹, Uwe Sauer², Nicola Zamboni¹

Affiliations

¹ Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.
² Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland sauer@imsb.biol.ethz.ch.

PMID: 28093455
PMCID: PMC5293155
DOI: 10.15252/msb.20167150

Abstract

Metabolism is one of the best-understood cellular processes whose network topology of enzymatic reactions is determined by an organism's genome. The influence of genes on metabolite levels, however, remains largely unknown, particularly for the many genes encoding non-enzymatic proteins. Serendipitously, genomewide association studies explore the relationship between genetic variants and metabolite levels, but a comprehensive interaction network has remained elusive even for the simplest single-celled organisms. Here, we systematically mapped the association between > 3,800 single-gene deletions in the bacterium Escherichia coli and relative concentrations of > 7,000 intracellular metabolite ions. Beyond expected metabolic changes in the proximity to abolished enzyme activities, the association map reveals a largely unknown landscape of gene-metabolite interactions that are not represented in metabolic models. Therefore, the map provides a unique resource for assessing the genetic basis of metabolic changes and conversely hypothesizing metabolic consequences of genetic alterations. We illustrate this by predicting metabolism-related functions of 72 so far not annotated genes and by identifying key genes mediating the cellular response to environmental perturbations.

Keywords: GWAS; functional genomics; interaction network; metabolism; metabolomics.

Figures

**Figure EV1. Annotation coverage**
A, B
Metabolites putatively detected in central metabolism for (A) negative mode and (B) positive mode ionization. Green circles represent compounds specifically associated with a single detected ion within 3 mDa tolerance. Blue circles represent compounds that are ambiguously associated with an ion, that is, compounds whose mass matches a detected ion but whose molecular weight is not unique in the metabolic model used for annotation. The dot size is proportional to the annotation confidence. Additional compounds not depicted on the amended network were also putatively identified.

**Figure 1. Gene–metabolite association matrix derived from metabolome analysis of single‐gene deletion mutants.**

**Figure EV2. Biological reproducibility of Z‐scores**
For each ion and deletion mutant, the absolute difference between Z‐scores of two biological replicates separately processed on different days was calculated. The overall distribution is shown in the histogram. Mean value (μ, red dashed line) and 99.9% confidence interval (μ + 3σ, green dashed line) are indicated.
Growth‐rate dependence of metabolic changes. For each deletion mutant, the number of hits (largest 0.1% of metabolic changes) is plotted against the respective growth rate.
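
The reproducibility summary described for Figure EV2 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `replicate_spread` and its inputs are hypothetical, and the use of the population standard deviation is an assumption, since the caption does not state which estimator was used.

```python
from statistics import fmean, pstdev

def replicate_spread(z_rep1, z_rep2):
    """Summarize absolute Z-score differences between two replicates.

    Returns (mu, sigma, mu + 3 * sigma). The population standard
    deviation is an assumption made for this sketch.
    """
    # One absolute difference per (ion, mutant) pair.
    diffs = [abs(a - b) for a, b in zip(z_rep1, z_rep2)]
    mu = fmean(diffs)
    sigma = pstdev(diffs)
    return mu, sigma, mu + 3 * sigma
```

Differences above the returned cutoff would fall outside the μ + 3σ band drawn in the histogram.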

**Figure 2. Gene–metabolite associations in the 0.1% most significant associations ranked by z‐score, corresponding to an absolute z‐score > ~5**
Genes were classified according to the Cluster of Orthologous Groups. Annotated ions were grouped according to the genome‐scale metabolic model of *Escherichia coli* (Orth *et al*, 2011). Unknown ions were omitted. The ribbon width scales with the number of interactions.

**Figure EV3. Genes associated with the detectable intermediates of chorismate metabolism and the TCA cycle**
A, B
Genes associated with the detectable intermediates of chorismate metabolism (A) and the TCA cycle (B). To illustrate only high‐confidence links, we report only the edges associating a gene to deprotonated metabolites. Color code of metabolites and genes is as in Fig 2. Green and red edges represent increase and decrease in metabolite levels, respectively.

**Figure 3. Locality analysis for enzyme deletions**
The distribution of empirical P‐values (calculated from a permutation test) is plotted for enzyme deletions and the respective metabolites up to a distance of five enzymatic steps. In each enzyme‐deletion mutant, the modified z‐scores of metabolites at distance 1, 2, 3, 4, or 5 are compared to the average changes obtained by selecting metabolites at random. For each of the five tested distances, the fraction of enzyme deletions yielding significant, distance‐enriched metabolic changes is highlighted below the red line. For a substantial fraction of tested enzymes, the largest metabolic changes are observed within two enzymatic steps.
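
A permutation test of the kind named in the Figure 3 caption might look like the sketch below. It rests on assumptions: the statistic (mean absolute z‐score), the number of permutations, and all names (`locality_p_value`, `z_scores`, `nearby`) are hypothetical and are not taken from the paper.

```python
import random
from statistics import fmean

def locality_p_value(z_scores, nearby, n_perm=2000, seed=0):
    """Empirical P-value that metabolites in `nearby` change more
    strongly than equally sized random sets of metabolites.

    z_scores: dict mapping metabolite id -> z-score in one mutant.
    nearby:   ids of metabolites at a given distance from the enzyme.
    """
    rng = random.Random(seed)
    all_abs = [abs(z) for z in z_scores.values()]
    observed = fmean([abs(z_scores[m]) for m in nearby])
    k = len(nearby)
    # Count random sets whose mean change is at least as large.
    hits = sum(
        fmean(rng.sample(all_abs, k)) >= observed
        for _ in range(n_perm)
    )
    # The +1 correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Running this once per distance (1 to 5) for each enzyme deletion would give one P‐value distribution per distance, which is the quantity the figure plots.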

**Figure EV4. Enzymes with significant changes of metabolites in the immediate vicinity**
Enzymes with significant changes of metabolites in the immediate vicinity (i.e., distance 1, at P‐value cutoffs of 0.01 and 0.05, calculated from a permutation test), grouped by metabolic pathways according to the *Escherichia coli* metabolic model. For each pathway, the fraction of detected genes yielding the largest changes in metabolites in the immediate vicinity is indicated beside the bar chart.

**Figure EV5. Pathway enrichment analysis**
Statistical significance of the association between gene deletions (rows) and metabolic changes grouped by metabolic pathways (columns). Metabolic pathways in which gene deletions exhibit a significant (P‐value < 0.05, calculated from a permutation test) overrepresentation of strongly altered metabolites are represented as a square, non‐symmetric matrix.

**Figure EV6. Locality analysis for enzymes, transcription factors and non‐metabolic proteins**
A–C
Locality analysis for genes encoding enzymes, transcription factors, or proteins of non‐metabolic function [e.g., ribosomes (Andres Leon *et al*, 2009)]. The distribution of the significance (i.e., P‐value, calculated from a permutation test) of the locality test for each of the tested genes is reported. Genes are grouped in three major classes: enzymes, transcription factors, and genes encoding proteins that establish a physical interaction with at least one annotated enzyme (i.e., non‐metabolic genes). The gray region in the histogram highlights those gene deletions for which a significant local effect can be inferred (P‐value < 0.05). The overrepresentation of genes within this region supports a tendency for several knockouts to elicit local metabolic effects.

**Figure EV7. Distribution of metabolite changes**
Frequency of metabolite changes in gene knockout mutants in the set of 0.1% most significant changes, seemingly following a power‐law distribution.
Definition of boundaries for the classification of mutants according to the number of differential ions present in the top 0.1% of changes.
Classification of mutants based on number of detectable changes in metabolome.
Cellular function enrichment analysis. Enrichment significance (P‐value) was derived by hypergeometric probability density function. Only significant enrichments with a P‐value < 0.1 are highlighted.
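
The hypergeometric calculation mentioned for the enrichment analysis can be illustrated with the standard library alone. This is a sketch under stated assumptions: the caption refers only to the hypergeometric probability density function, so the upper‐tail sum in `enrichment_p_value` is an added convention, and the function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k annotated genes in a class of n
    mutants, given K annotated genes among N genes in total."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def enrichment_p_value(k, N, K, n):
    """Upper-tail probability of observing k or more annotated genes
    (an assumption; the caption names only the density function)."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))
```

Under this convention, a cellular function would be highlighted for a mutant class when the returned value is below 0.1, the cutoff stated in the caption.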

**Figure 4. Network recovery for isoenzymes and protein complexes**
Recovery of enzyme function. Receiver operating characteristic curves obtained for the recovery of *Escherichia coli* isoenzymes and protein complexes based on the metabolome profiles recorded in single deletion mutants. The area under the curve (AUC) is reported in parentheses.
Consistent metabolic patterns in mutants of protein complex subunits. The heatmap shows the pair‐wise similarity (e.g., CLR index) between metabolome responses to gene deletions. Genes related to densely connected protein complexes consisting of at least three subunits are selected. The protein complex adjacency matrix is shown, suitably reordered. Magnified protein complexes are 1, succinate dehydrogenase; 2, cytochrome bo terminal oxidase; 3, fumarate reductase/phosphate ABC transporter/dipeptide ABC transporter; 4, murein tripeptide ABC transporter; 5, ferric enterobactin transport complex/ferric dicitrate transport system; 6, NADH:ubiquinone oxidoreductase/Tol–Pal cell envelope complex and high‐scoring combinations thereof.
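
The area under a receiver operating characteristic curve, as reported in Figure 4, can be computed directly from pairwise similarity scores. The sketch below uses the rank‐based (Mann–Whitney) formulation; the function name `roc_auc` and the idea of labeling gene pairs as true or false by shared complex or isoenzyme membership are illustrative assumptions, not the authors' implementation.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a true pair (label truthy) scores
    higher than a false pair; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect recovery of the known pairs. The double loop is quadratic in the number of pairs, so a sort‐based variant would be preferable for genome‐scale comparisons.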

**Figure 5. Enrichment of metabolic functions for orphan genes**
Enrichment of metabolic functions (defined by Clusters of Orthologous Groups, COG) for each y‐gene based on genes of known function with similar metabolome profiles, as determined by CLR.
Mutually exclusive function prediction of orphan genes as either enzymes or transcription factors (TF). The inset represents the number of genes predicted to be TFs (yellow), enzymes (blue), or neither (gray). One gene was predicted to be both a TF and an enzyme.

**Figure EV8. Potential metabolic functions of YgfY, YidR and YidK**
Potential functions of YgfY: transcriptional, post‐translational, or complex‐related modulator of succinate dehydrogenase activity.
Succinate dehydrogenase activity assay in cell lysates of wild‐type strain, sdhC, and ygfY mutants. Fumarate formation from supplied succinate was followed over time by mass spectrometry. Data are shown as mean and standard deviation of three replicates.
Growth defect of ygfY mutant on succinate minimal medium in comparison with wild‐type strain and sdhC mutant. Data are shown as mean and standard deviation of two replicates.
Growth defect of yidR and yidK mutants on mineral salt medium with the indicated carbon source in comparison with wild‐type strain. Solid line indicates mean from at least two replicates.
Correlation of expression levels over all 907 experiments in the M3D database. R² indicates goodness of a linear fit to the data and the strength of the correlation. FucR is a transcriptional activator of operons involved in fucose metabolism, and DgoT is a putative galactonate transporter.

**Figure EV9. Predicted genes mediating cellular response to environmental stimuli**
Dendrograms represent genes with significant overlap of differential metabolites between the respective knockouts and environmental perturbations. Genes are grouped on the basis of their topological distance by means of the minimum number of connecting reactions on the metabolic network.

**Figure 6. Predicting genes mediating the metabolic response to environmental perturbations**
Dendrogram representing genes with significant overlap of differential metabolites in the respective knockout and during growth in the presence of 10 mg/ml deoxycholate in wild‐type *Escherichia coli*. Genes are hierarchically clustered based on their topological distance assessed by the minimum number of connecting reactions in the metabolic network.
Relative growth rates of wild‐type *E. coli* and deletion mutants in glucose minimal medium supplemented with casein hydrolysate and deoxycholate. Error bars represent standard deviations from three biological replicates.
Relative growth rates of wild‐type *E. coli* and enterobactin biosynthesis mutants in glucose minimal medium supplemented with enterobactin and deoxycholate. Error bars represent standard deviations from three biological replicates.

**Figure EV10. Growth rates of *Escherichia coli* wild‐type strain grown under varying iron, enterobactin, and deoxycholate concentrations**
*Escherichia coli* wild‐type strain was grown on minimal medium supplemented with casein hydrolysate, with iron concentrations ranging from 0.05 to 50 μM, in the presence of 0, 1, 2, or 3 mg/ml deoxycholate. Relative growth rates were calculated during exponential growth and normalized to the 50 μM iron condition. Error bars represent standard deviations from three biological replicates.
*Escherichia coli* wild‐type strain and deletion mutants in enterobactin biosynthesis were grown on minimal medium without amino acids and with 1.5 μM (indicated with +) or without (indicated with −) enterobactin in the absence of deoxycholate. Maximum average growth rates with standard deviations during exponential growth phase were calculated from triplicate cultivations.

References

1. Andres Leon E, Ezkurdia I, Garcia B, Valencia A, Juan D (2009) EcID. A database for the inference of functional interactions in E. coli. Nucleic Acids Res 37: D629–D635
2. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H (2006) Construction of Escherichia coli K‐12 in‐frame, single‐gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006.0008
3. Costanzo M, Baryshnikova A, Bellay J, Kim Y, Spear ED, Sevier CS, Ding H, Koh JL, Toufighi K, Mostafavi S, Prinz J, St Onge RP, VanderSluis B, Makhnevych T, Vizeacoumar FJ, Alizadeh S, Bahr S, Brost RL, Chen Y, Cokol M et al (2010) The genetic landscape of a cell. Science 327: 425–431
4. Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large‐scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: e8
5. Feist AM, Herrgard MJ, Thiele I, Reed JL, Palsson BO (2009) Reconstruction of biochemical networks in microorganisms. Nat Rev Microbiol 7: 129–143
