Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Comparative Study
. 2013 Sep 30;8(9):e74335.
doi: 10.1371/journal.pone.0074335. eCollection 2013.

Lessons learned from quantitative dynamical modeling in systems biology

Affiliations
Comparative Study

Lessons learned from quantitative dynamical modeling in systems biology

Andreas Raue et al. PLoS One. .

Erratum in

  • PLoS One. 2013;8(12). doi:10.1371/annotation/ea0193d8-1f7f-492a-b0b7-d877629fdcee. Schelke, Max [corrected to Schelker, Max]

Abstract

Due to the high complexity of biological data it is difficult to disentangle cellular processes relying only on intuitive interpretation of measurements. A Systems Biology approach that combines quantitative experimental data with dynamic mathematical modeling promises to yield deeper insights into these processes. Nevertheless, with growing complexity and increasing amount of quantitative experimental data, building realistic and reliable mathematical models can become a challenging task: the quality of experimental data has to be assessed objectively, unknown model parameters need to be estimated from the experimental data, and numerical calculations need to be precise and efficient. Here, we discuss, compare and characterize the performance of computational methods throughout the process of quantitative dynamic modeling using two previously established examples, for which quantitative, dose- and time-resolved experimental data are available. In particular, we present an approach that allows to determine the quality of experimental data in an efficient, objective and automated manner. Using this approach data generated by different measurement techniques and even in single replicates can be reliably used for mathematical modeling. For the estimation of unknown model parameters, the performance of different optimization algorithms was compared systematically. Our results show that deterministic derivative-based optimization employing the sensitivity equations in combination with a multi-start strategy based on latin hypercube sampling outperforms the other methods by orders of magnitude in accuracy and speed. Finally, we investigated transformations that yield a more efficient parameterization of the model and therefore lead to a further enhancement in optimization performance. We provide a freely available open source software package that implements the algorithms and examples compared here.

PubMed Disclaimer

Conflict of interest statement

Competing Interests: BH works for a commercial company (Merrimack Pharmaceuticals). This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.

Figures

Figure 1
Figure 1. Quantitative dynamic models describing erythropoetin signaling used as examples.
The hormone erythropoietin (Epo) is the key regulator of erythropoiesis, the production of red blood cells. (a) Epo receptor model . The model describes the interaction and the trafficking of the hormone and of its membrane receptor (EpoR). The active complex Epo_EpoR can be internalized (Epo_EpoRformula image) and is either recycled back to the cell membrane or is degraded (dEpoformula image, dEpoformula image). (b) Model of Epo induced JAK2/STAT5 signaling . In erythroid progenitor cells (CFU-E), the hormone Epo induces activation of the tyrosine kinase Janus kinase 2 (JAK2). Subsequently, the signal transducer and activator of transcription 5 protein (STAT5) is activated and shuttles to the nucleus where it induces target gene expression. Two of the target genes encode for the negative feedback regulators suppressor of cytokine signaling 3 (SOCS3) and cytokine-inducible SH2-containing protein (CIS).
Figure 2
Figure 2. Data obtained by different experimental approaches, associated measurement noise and corresponding model simulation of the pathway dynamics.
Solid lines indicate the simulated model dynamics with the optimal parameter values. Gray shading indicates one standard deviation of the measurement noise that is associated with the respective measurement technique. (a,b) Data for Epo receptor model , 36 out of 85 total data points used for parameter estimation are displayed. (c,d,e,f) Data for JAK2/STAT5 model , 92 out of 541 total data points used for parameter estimation are displayed.
Figure 3
Figure 3. Acceleration of numerical computations for the solution of ODE systems by multithreading.
For each bar in the figure the 24 variants of the ODE systems used for the JAK2/STAT5 model were solved for 1000 randomly drawn sets of parameters using a Latin hypercube sampling strategy. The theoretically possible acceleration is displayed by the red dashed line.
Figure 4
Figure 4. Analysis of the bias and variance of parameter estimates induced by different methods for the assessment of measurement noise.
The figure shows the result of two hundred independent parameter estimation runs for one hundred different sets of simulated data for the parameters Aformula image, Bformula image, kformula image, kformula image and formula image, formula image where applicable. The amount of measurement noise was assessed by a pre-processing approach (a), by using a simultaneous estimation of dynamics and measurement noise (b) and using the true values for formula image and formula image in the estimation (c).
Figure 5
Figure 5. Performance analysis of parameter estimation using numerical optimization methods.
(a) A two dimensional parameter estimation problem bearing multiple optima (global: A; local: B,C,D) is displayed for illustrative purposes. Traces in parameter space of two hypothetical methods with high (blue) and low performance (red) are displayed. 50 independent runs with each method are displayed; the circles indicate the results of the estimation. (b) The visualization of optimization performance by sorting objective function values increasingly is also possible for high dimensional problems. It reveals that the performance of the red method is low, i.e. results are unreliable, whereas the performance of the blue method is high, i.e. results are reproducible and reliable. (c,d) Visualization of performance using 100 independent optimization runs with each of the considered algorithms for both quantitative dynamic models. For illustrative reasons, the global optimum was centered to one. For stochastic optimization (gray), 12 different algorithms were used (see Figure 11 for details). For deterministic optimization, two different approaches for the calculation of derivatives were compared: (i) finite difference approximation (red) and (ii) analytically derived sensitivity equations (orange and blue). Initial guesses for the parameters were generated by Latin hypercube sampling . All algorithms are described in Table 1.
Figure 6
Figure 6. Number of calls to the objective function for 100 independent optimization runs with each of the considered algorithms for the JAK2/STAT5 model.
For description of the algorithms, stochastic optimization (gray), deterministic optimization (red, blue) and hybrid optimization (green) see Table 1. The LSQNONLIN algorithm (MATLAB, R2011a, The Mathworks Inc., Natick, MA) was used in combination with Latin hyper cube sampling of the initial parameter guesses and using two different approaches for derivative calculation, finite difference approximation (FD) and the sensitivity equations (SE). LSQNONLIN using SE is the most efficient algorithm for parameter estimation considered in this study. For all algorithms default options were used. The maximum number of allowed function evaluations formula image was chosen such that the algorithm stops before reaching formula image in most cases. For stochastic optimization (gray) formula image, for deterministic optimization using SE (red) formula image and using FD (blue) formula image and for hybrid optimization (green) formula image.
Figure 7
Figure 7. Comparison of derivatives calculated by finite difference approximation and by the sensitivity equations.
The figure shows the calculation of derivatives formula image for the JAK2/STAT5 model for the dynamical variable pSTAT5 at formula image minutes with respect to the parameter CISInh on the y-axis, for several values of the parameter CISInh on the x-axis. For finite difference approximation, formula image (red), formula image (blue) and formula image (green) was used. The numerical errors rapidly become uncontrollable as formula image gets smaller. For larger formula image, finite difference approximation is of bad quality showing systematic errors. In contrast, the analytical derivatives calculated by the sensitivity equations produce stable and correct results (black).
Figure 8
Figure 8. Decoupling of parameters using concentration scale invariance.
The figure shows the dependency of the dynamics of the original system and of the reparameterized system on the parameter Aformula image. In case of the original system (top row) the parameter controls the scale of species A but also the time when the steady state is reached. In case of the reparameterized system (bottom row) the parameter controls only and exclusively the concentration scale of the entire system leaving the shape of the dynamics unaffected.
Figure 9
Figure 9. Decoupling of parameters using time scale invariance.
The figure shows the dependency of the dynamics of the original system and of the reparameterized system on the parameter kformula image and kformula image respectively. In case of the original system (top row) the parameter controls the level of the steady state but also the time when the steady state is reached. In case of the reparameterized system (bottom row) the parameter controls only and exclusively the time scale of the entire system leaving the shape of the dynamics, especially the steady state level, unaffected.
Figure 10
Figure 10. Comparison of random sampling and Latin hypercube sampling (LHS) for the generation of initial parameter guesses.
(a) One realization of twenty samples drawn randomly in a two dimensional parameter space is shown. (b) Twenty samples drawn by LHS. (c,d) Euclidean distance to the nearest neighbor parameter values for one thousand repeated generations of twenty samples in a 2D parameter space.
Figure 11
Figure 11. Comparison of optimization performance for stochastic optimization algorithms.
The figure shows 100 independent runs by each of the considered algorithms (Table 1) on the x-axis, sorted by the respective value of the objective function. For illustrative reasons, the y-axis was shifted by a constant (Figure 5b).

References

    1. Swameye I, Müller T, Timmer J, Sandra O, Klingmüller U (2003) Identification of nucleocytoplasmic cycling as a remote sensor in cellular signaling by databased modeling. Proceedings of the National Academy of Sciences 100: 1028–1033. - PMC - PubMed
    1. Becker V, Schilling M, Bachmann J, Baumann U, Raue A, et al. (2010) Covering a broad dynamic range: information processing at the erythropoietin receptor. Science 328: 1404–1408. - PubMed
    1. Wolkenhauer O (2008) Systems Biology. London, UK: Portland Press.
    1. Heinrich R, Schuster S (1996) The Regulation of Cellular Systems. London, UK: Chapman & Hall.
    1. Spencer S, Gaudet S, Albeck J, Burke J, Sorger P (2009) Non-genetic origins of cell-to-cell variability in TRAIL-induced apoptosis. Nature 459: 428–432. - PMC - PubMed

Publication types