Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2019 Mar;14(3):639-702.
doi: 10.1038/s41596-018-0098-2.

Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v.3.0

Affiliations

Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v.3.0

Laurent Heirendt et al. Nat Protoc. 2019 Mar.

Abstract

Constraint-based reconstruction and analysis (COBRA) provides a molecular mechanistic framework for integrative analysis of experimental molecular systems biology data and quantitative prediction of physicochemically and biochemically feasible phenotypic states. The COBRA Toolbox is a comprehensive desktop software suite of interoperable COBRA methods. It has found widespread application in biology, biomedicine, and biotechnology because its functions can be flexibly combined to implement tailored COBRA protocols for any biochemical network. This protocol is an update to the COBRA Toolbox v.1.0 and v.2.0. Version 3.0 includes new methods for quality-controlled reconstruction, modeling, topological analysis, strain and experimental design, and network visualization, as well as network integration of chemoinformatic, metabolomic, transcriptomic, proteomic, and thermochemical data. New multi-lingual code integration also enables an expansion in COBRA application scope via high-precision, high-performance, and nonlinear numerical optimization solvers for multi-scale, multi-cellular, and reaction kinetic modeling, respectively. This protocol provides an overview of all these new features and can be adapted to generate and analyze constraint-based models in a wide variety of scenarios. The COBRA Toolbox v.3.0 provides an unparalleled depth of COBRA methods.

PubMed Disclaimer

Conflict of interest statement

COMPETING FINANCIAL INTERESTS The authors declare that they have no competing financial interests.

Figures

Figure 1:
Figure 1:
Overview of key constraint-based reconstruction and analysis concepts. a. A genome-scale metabolic reconstruction is a structured knowledge-base that abstracts pertinent information on the biochemical transformations taking place within a chosen biochemical system, e.g., the human gut microbiome. Genome-scale metabolic reconstructions are built in two steps. First, several platforms exist for the generation of a draft metabolic reconstruction based on genome annotations. Second, the draft reconstructions need to be refined based on known experimental and biochemical data from literature. Novel experiments can be performed on the organism and the reconstruction refined accordingly. b. A phenotypically feasible solution space is defined by specifying certain assumptions, e.g., a steady-state assumption, then converting the reconstruction into computational model that eliminates physicochemically or biochemically infeasible network states. Various methods are used to interrogate the solution space. For example, optimisation for a biologically motivated objective function (e.g. biomass production) identifies a single optimal flux vector, whereas uniform sampling provides an unbiased characterisation via flux vectors uniformly distributed in the solution space. c Flux balance analysis is an optimization method that maximizes a linear objective function, ψ(υ)=cTυ, formed by multiplying every reaction flux υj with a predetermined coefficient, cj, subject to a steady state assumption, Sυ = 0, as well as lower and upper bounds on each reaction flux (lbj and ubj, respectively).
Figure 2:
Figure 2:
Continuous integration of new code (submitted by developers) is performed on a dedicated server running Jenkins (https://jenkins.io). The main code is located in the src folder and tests functions in the test folder. A test not only runs a function (first degree testing), but tests the output of that function (second degree testing). The continuous integration setup relies on end-of-year releases of MATLAB only. Soon after the latest stable version of MATLAB is released, full support will be provided for the COBRA Toolbox. After a successful run of tests on the three latest end-of-year releases of MATLAB using various solver packages, the documentation based on the headers of the functions (docstrings) is extracted, generated, and automatically deployed. Immediate feedback through code coverage reports (https://codecov.io/gh/opencobra/cobratoolbox) and build statuses are reported on GitHub. With this setup, the impact of local changes in the code base is promptly revealed.
Figure 3:
Figure 3:
Conceptual overview of the main steps involved in the unsteady-state flux balance analysis (uFBA) method.
Figure 4:
Figure 4:
Solution spaces from steady state fluxes are anisotropic, that is, long in some directions and short in others. This impedes the ability of any sampling algorithm taking a random direction to evenly explore the full feasible set (artificial centering hit-and-run (ACHR) algorithm). The CHRR (coordinate hit-and-run with rounding) algorithm first rounds the solution space based on the maximum volume ellipsoid. Then, the rounded solution space is uniformly sampled using a provably efficient coordinate hit-and-run random walk. Finally, the samples are projected back onto the anisotropic feasible set. This leads to a more distributed uniform sampling, so that the converged sampling distributions for the selected reactions become smoother. As an example, for both sampling distributions, the parameters were defined as: nSkip = 8 × (dim(fluxSpace)), nSarnples = 1000.
Figure 5:
Figure 5:
In the OptForce procedure, the MUST sets are determined by contrasting the flux ranges obtained using flux variability analysis (FVA) of a wild-type (blue bars) and an overproducing strain (red bars). The first order MUST sets (top panel) are denoted MUSTL and MUSTU. For instance, a reaction belongs to the MUSTU set if the upper bound of the flux range in the wild-type is less than the lower bound of the flux range of the overproducing strain. The center and bottom panels show all possible second order MUST sets.
Figure 6:
Figure 6:
The openCOBRA repository and the fork of a contributor located on the Github server can be cloned to the local computer as cobratoolbox and fork-cobratoolbox folders, respectively. Each repository might contain different branches, but each repository contains the master and develop branches. Note that contributors only have read on the openCOBRA repository. The stable branch is the master branch (black branch), while the development of code is made on the develop branch (green branch). The master branch shall be checked out when using the cobratoolbox repository, whereas contributors shall create new branches originating from the develop branch (local fork-cobratoolbox directory and online <username>/cobratoolbox repository). In the present example, myBranch1 (blue branch) has already been pushed to the forked repository on the Github server, while myBranch2 (pink branch) is only present locally. The branch myBranch1 may be merged into the develop branch of the openCOBRA repository through opening a pull request. In order to submit the contributions (commits) on myBranch2, the contributor must first push the commits to the forked repository (https://github.com/<username>/cobratoolbox) before opening a pull request. Any commit made on the develop branch (red square) will be merged to the master branch if the develop branch is stable overall (orange square).
Figure 7:
Figure 7:
Output of initialisation of the COBRA Toolbox with initCobraToolbox.
Figure 8:
Figure 8:. An energy generating stoichiometrically balanced cycle.
The smallest stoichiometrically balanced cycle that produces ATP at a maximal rate using the ATP synthase reaction, in Recon3D, with all internal reactions. All metabolite and reaction abbreviations are primary keys in the Virtual Metabolic Human database (https://vmh.uni.lu): reaction abbreviation, reaction name: ADK1m, adenylate kinase, mitochondrial; G5SDym, glutamate-5-semialdehyde dehydrogenase, mitochondrial; GLU5Km, glutamate 5-kinase, mitochondrial; P45027A15m, 5-beta-cytochrome P450, family 27, subfamily A, polypeptide 1; PPAm, inorganic diphosphatase; r0074, L-glutamate 5-semialdehyde:NAD+ oxidoreductase; HMR_3966, nucleoside-triphosphate giphosphatase; ATPS4mi, ATP synthase (four protons for one ATP); CYOR_u10mi, ubiquinol-6 cytochrome c reductase, Complex III; NADH2_u10mi, NADH dehydrogenase, mitochondrial; CYOOm2i, cytochrome c oxidase, mitochondrial complex IV.
Figure 9:
Figure 9:
The interventions predicted by the OptForce method for succinate overproduction in E. coli (AntCore model) under aerobic conditions. Reactions that need to be up-regulated (green arrows and labels) and knocked out (red arrows and labels) are shown in this simplified metabolic map. The strategies include up-regulation of reactions generating succinate such as isocitrate dehydrogenase (R2), α-ketoglutarate dehydrogenase or succinyl-CoA synthetase, along with knockout of reactions draining succinate such as succinate dehydrogenase or fumarate hydratase. Note that each of these reactions may associate with one or more genes in E. coli.
Figure 10:
Figure 10:. Qualitatively forward, quantitatively reverse reactions in a multi-compartmental, genome-scale model.
In Recon3D, the transformed reaction Gibbs energy could be estimated for 7, 215 reactions. Of these reactions, 2, 868 reactions were qualitatively assigned to be forward in the reconstruction, but were quantitatively assigned to be reversible using subcellular compartment specific thermodynamic parameters, the component contribution method, and broad bounds on metabolite concentrations (10−5 – 0.02 mol/L), except for certain cofactors. The geometric mean (green) and feasible range (between maximum and minimum) of estimated millimolar standard transformed reaction Gibbs energy (Ar G’m, blue) and transformed reaction Gibbs energy (ΔrG′m, red) are illustrated. The relative uncertainty in metabolite concentrations versus uncertainty in thermochemical estimates is reflected by the relative breadth of the red and blue bars for each reaction, respectively. The reactions are rank ordered by the cumulative probability that millimolar standard transformed reaction Gibbs energy is less than zero, PrG′m < 0), (black descending line from left to right). This assumes that all metabolites are at a millimolar concentration (1mM) and a Gaussian error is assumed in component contribution estimates. In this ordering, forward transport reactions have PrG′m < 0) = 1 (far left) and reverse transport reactions have PrG′m < 0) = 0 (far right). In between, from left to right are biochemical reactions with decreasing cumulative probability of being forward in direction, subject to the stated assumptions. Alternative rankings are possible. The key point is to observe that the COBRA Toolbox is primed for quantitative integration of metabolomic data as the uncertainty in transformed reaction Gibbs energy associated with thermochemical estimates using the component contribution method is now significantly lower than the uncertainty associated with the assumption of broad concentration range.
Figure 11:
Figure 11:
Overlay of the flux vector for maximum ATP synthase flux, using flux balance analysis with regularisation of the flux vector. Active fluxes are highlighted (purple).
Figure 12:
Figure 12:
Selective scope visualisation of the E. coli core model model by Paint4Net. Rectangles represent reactions with rates of fluxes in brackets; the red rectangles represent reactions with only one metabolite; ellipses represent metabolites; the red ellipses represent dead end metabolites; grey edges represent zero-rate fluxes; green edges represent positive-rate (forward) fluxes; and blue edges represent negative-rate (backward) fluxes. Network visualisation also enables zoom to specific regions.

References

    1. Palsson BØ Systems Biology: Constraint-based Reconstruction and Analysis. Cambridge University Press, Cambridge, England, January (2015).
    1. O’Brien EJ, Monk JM, and Palsson BO Using Genome-scale Models to Predict Biological Capabilities. Cell 161(5), 971–987, May (2015). - PMC - PubMed
    1. Becker SA, Feist AM, Mo ML, Hannum G, Palsson BØ, et al. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nat. Protocols 2(3), 727–738, March (2007). - PubMed
    1. Schellenberger J, Que R, Fleming RMT, Thiele I, Orth JD, et al. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat. Protocols 6(9), 1290–1307, September (2011). - PMC - PubMed
    1. Lewis NE, Nagarajan H, and Palsson BO Constraining the metabolic genotype–phenotype relationship using a phylogeny of in silico methods. Nature Reviews Microbiology 10, February (2012). - PMC - PubMed

Publication types