Review
Nat Rev Microbiol. 2020 Dec;18(12):731-743.
doi: 10.1038/s41579-020-00440-4. Epub 2020 Sep 21.

Reconstructing organisms in silico: genome-scale models and their emerging applications

Xin Fang et al. Nat Rev Microbiol. 2020 Dec.

Abstract

Escherichia coli is considered to be the best-known microorganism given the large number of published studies detailing its genes, its genome and the biochemical functions of its molecular components. This vast literature has been systematically assembled into a reconstruction of the biochemical reaction networks that underlie E. coli's functions, a process which is now being applied to an increasing number of microorganisms. Genome-scale reconstructed networks are organized and systematized knowledge bases that have multiple uses, including conversion into computational models that interpret and predict phenotypic states and the consequences of environmental and genetic perturbations. These genome-scale models (GEMs) now enable us to develop pan-genome analyses that provide mechanistic insights, detail the selection pressures on proteome allocation and address stress phenotypes. In this Review, we first discuss the overall development of GEMs and their applications. Next, we review the evolution of the most complete GEM that has been developed to date: the E. coli GEM. Finally, we explore three emerging areas in genome-scale modelling of microbial phenotypes: collections of strain-specific models, metabolic and macromolecular expression models, and simulation of stress responses.

Conflict of interest statement

Competing interests

The authors declare no conflict of interest.

Figures

Figure 1:
Basic principles of constraint-based modeling of cellular functions. a| Metabolic genes from annotated genomes of interest and metabolic knowledge lead to metabolic reactions. b| Integration of all the metabolic reactions through shared metabolites results in the construction of a metabolic network for the organism of interest. c| The metabolic network can be converted into a stoichiometric matrix (S matrix) where rows represent metabolites, columns represent reactions, and each entry represents the reaction coefficient of a particular metabolite in a reaction. d| With the S matrix and the objective function of the model, one can solve for the flux distributions. The solution space is where all possible solutions of flux distribution reside, and each axis represents the metabolic flux of a reaction. e| Applying additional constraints will shrink the allowable solution space. Commonly used constraints include the steady-state assumption and feasible ranges of metabolic flux. f| One or multiple optimal solutions that optimize the objective function of the model can be found in the allowable solution space (as represented by the red dot in the figure).
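The steps in panels c to f can be illustrated with a minimal sketch. The four-reaction network, its bounds and the objective below are hypothetical and are not taken from the Review. Real constraint-based tools solve this problem with linear programming; here an exhaustive search over integer fluxes stands in for the solver so that the example needs only the Python standard library.

```python
from itertools import product

# Hypothetical toy network (not from the Review). Each reaction maps
# metabolite -> stoichiometric coefficient (negative means consumed).
reactions = {
    "EX_A": {"A": 1},            # uptake:    -> A
    "R1":   {"A": -1, "B": 1},   # conversion: A -> B
    "R2":   {"A": -1},           # overflow:   A ->
    "BIO":  {"B": -1},           # objective:  B ->
}
# Assumed feasible flux ranges (lower, upper) for each reaction.
bounds = {"EX_A": (0, 10), "R1": (0, 6), "R2": (0, 10), "BIO": (0, 10)}
objective = "BIO"

metabolites = sorted({m for stoich in reactions.values() for m in stoich})
rxn_ids = list(reactions)

# S matrix: rows are metabolites, columns are reactions (panel c).
S = [[reactions[r].get(m, 0) for r in rxn_ids] for m in metabolites]


def is_steady_state(v):
    """Return True if S . v = 0, i.e. no metabolite accumulates."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)


def brute_force_fba():
    """Enumerate integer fluxes within bounds and keep the best ones.

    This is a brute-force stand-in for linear programming, usable only
    on tiny networks.
    """
    ranges = [range(bounds[r][0], bounds[r][1] + 1) for r in rxn_ids]
    feasible = [v for v in product(*ranges) if is_steady_state(v)]
    obj_idx = rxn_ids.index(objective)
    best = max(v[obj_idx] for v in feasible)
    optima = [dict(zip(rxn_ids, v)) for v in feasible if v[obj_idx] == best]
    return best, optima


best, optima = brute_force_fba()
print("S =", S)
print("optimal objective flux:", best)
print("number of optimal integer flux distributions:", len(optima))
```

In this toy case the cap on R1 limits the objective, while the uptake and overflow fluxes can still vary together, so several flux distributions share the same optimum. This mirrors the "one or multiple optimal solutions" described for panel f.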
Figure 2:
The increasing number of genome sequences and the development of genome-scale models. a| The number of public genome sequences in the PATRIC database. b| Number of reactions and metabolites represented in 108 manually curated models in the BiGG Models database. c| Multiple correspondence analysis (MCA) of the reactomes of the 108 reconstructions. d| Coverage of the 108 reconstructions in the tree of life. The number in parentheses represents the number of reconstructions in each branch.
Figure 3:
Historical development of Escherichia coli genome-scale models. Development of existing and potential future genome-scale models (both metabolic, shown in orange, and metabolic and macromolecular expression (ME) models, shown in blue) of E. coli. The genome-scale metabolic model of E. coli first appeared in the early 2000s. An increasing scope of biological functions has been incorporated into the model, leading to various generations of the metabolic models as new discoveries were made. In the early 2010s, ME models that incorporate transcription and translation mechanisms emerged. Multiple efforts followed to improve and expand the ME model. Going into the 2020s, extensions of stress response modules have been added to ME models. Future directions involve incorporation of the sensome to form the StressME model, and the inclusion of toxins, biosynthetic gene clusters and cell cycle. Ovals indicate models, and boxes represent data incorporated to generate the models. According to the naming convention for network reconstructions, model names consist of an ‘i’ for in silico followed by the initials of the person(s) who built the model, and the number of open reading frames accounted for in the reconstruction.
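The naming convention described at the end of this caption can be sketched as a small parser. The function name and the regular expression are illustrative assumptions, and the model names used below (iJR904, iAF1260, iJO1366, iML1515) are well-known E. coli reconstructions given only as examples. Names that carry extra suffixes are not handled.

```python
import re

# 'i' for in silico, then the builders' initials, then the number of
# open reading frames. This pattern is an assumption for illustration.
MODEL_NAME = re.compile(r"^i([A-Za-z]+)(\d+)$")


def parse_model_name(name):
    """Split a reconstruction name into (initials, number_of_orfs)."""
    match = MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the i<initials><ORFs> pattern")
    return match.group(1), int(match.group(2))


for name in ("iJR904", "iAF1260", "iJO1366", "iML1515"):
    initials, n_orfs = parse_model_name(name)
    print(f"{name}: built by {initials}, {n_orfs} open reading frames")
```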
Figure 4:
Generation of strain-specific Escherichia coli genome-scale models and their application to multi-strain studies. Strain-specific models were generated from genome sequences of strains of interest and a curated reference model. The annotated genome sequences of target strains are mapped to the reference genome sequence to generate the homology matrix that delineates the gene sequence similarity across strains. The homology matrix can be used to create draft models of target strains. These models can then be finalized by manual curation. Strain-specific models were used to reveal variation in metabolic capabilities across different pathotypes, as illustrated in three studies shown on the right. The first multi-strain study of E. coli genome-scale models (GEMs) found metabolic capabilities predicted by GEMs correspond to pathotype and environment. In the second study, comparison of GEMs constructed for inflammatory bowel disease (IBD) clinical isolates suggested the possible link between metabolic functions of B2 strains and their prevalence in individuals with IBD. Lastly, GEMs of dominant strains in an individual with IBD revealed the potential correlation between metabolism and inflammation. Panel ‘Strains of different pathotypes’ adapted from ref. . Panel ‘IBD clinical isolates’ adapted from ref. . Panel ‘Dominating strains in IBD gut microbiome’ adapted from ref. .
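The drafting step described in this caption can be sketched as follows. Everything in the example is hypothetical: the gene and reaction identifiers, the identity values, the 80% cutoff, and the simplified rule that a reaction is kept if any one of its associated genes has a homolog. Published workflows use full gene-protein-reaction rules and follow the draft with manual curation.

```python
# Hypothetical reference model: reaction -> genes that can catalyse it
# (simplified rule: any one gene suffices, i.e. an OR over isozymes).
reference_rules = {
    "RXN1": {"geneA"},
    "RXN2": {"geneB", "geneC"},
    "RXN3": {"geneD"},
}

# Hypothetical homology matrix: percent identity of the best hit for each
# reference gene in each target strain (0.0 means no hit was found).
homology = {
    "geneA": {"strain1": 99.1, "strain2": 97.4},
    "geneB": {"strain1": 98.0, "strain2": 35.0},
    "geneC": {"strain1": 0.0, "strain2": 91.2},
    "geneD": {"strain1": 96.5, "strain2": 0.0},
}

IDENTITY_THRESHOLD = 80.0  # assumed cutoff, chosen only for illustration


def genes_present(strain, threshold=IDENTITY_THRESHOLD):
    """Reference genes with a sufficiently similar homolog in the strain."""
    return {g for g, hits in homology.items() if hits.get(strain, 0.0) >= threshold}


def draft_model(strain, threshold=IDENTITY_THRESHOLD):
    """Reactions retained in the draft strain-specific model."""
    present = genes_present(strain, threshold)
    return sorted(r for r, genes in reference_rules.items() if genes & present)


drafts = {strain: draft_model(strain) for strain in ("strain1", "strain2")}
for strain, rxns in drafts.items():
    print(strain, rxns)
```

Comparing the resulting reaction sets across strains is the kind of analysis that the multi-strain studies in the caption build on, although those studies compare simulated metabolic capabilities rather than raw reaction lists.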
Figure 5:
General formulation of a metabolic and macromolecular expression model and its application to the study of stress response. Metabolic and macromolecular expression (ME) models are generated through the integration of M models and protein synthesis pathways including transcription, tRNA charging, and translation. Therefore, the ME model describes the biosynthesis of proteins and their roles in catalyzing the metabolic reactions. Stress-specific response mechanisms are integrated with the E. coli ME model to produce stress-specific ME models: FoldME, OxidizeME, and AcidifyME. FoldME models respond to temperature stress through the incorporation of chaperone-mediated (GroEL or DnaK) or spontaneous folding pathways. OxidizeME simulates the response to oxidative stress through the inclusion of oxidation and demetallation in metalloproteins, iron-sulfur cluster cofactor damage and repair, and DNA damage. AcidifyME models the mechanisms related to acid stress, including pH-dependent protein activity and stability, membrane composition, and intracellular buffering. CBO, cytochrome bo terminal oxidase; NDH-I, NADH dehydrogenase I; NDH-II, NADH dehydrogenase II; SDH, succinate dehydrogenase; Glu, glutamate; GABA, gamma-aminobutyric acid; NTPs, nucleoside triphosphates. FoldME panel adapted from ref. . OxidizeME panel adapted from ref. .

References

    1. Thiele I & Palsson BØ A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat. Protoc 5, 93–121 (2010).
       A detailed protocol to reconstruct a GEM from a genome sequence.

    2. Norsigian CJ, Fang X, Seif Y, Monk JM & Palsson BØ A workflow for generating multi-strain genome-scale metabolic models of prokaryotes. Nat. Protoc (2019) doi:10.1038/s41596-019-0254-3.
       A semi-automated workflow to generate strain-specific GEMs from a curated reference model.

    3. Price ND, Reed JL & Palsson BØ Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat. Rev. Microbiol 2, 886–897 (2004).

    4. Lewis NE, Nagarajan H & Palsson BØ Constraining the metabolic genotype–phenotype relationship using a phylogeny of in silico methods. Nat. Rev. Microbiol (2012).

    5. Kanehisa M & Goto S KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28, 27–30 (2000).
