. 2023 Sep 29;19(9):e1011515.

doi: 10.1371/journal.pcbi.1011515. eCollection 2023 Sep.

Profile-Wise Analysis: A profile likelihood-based workflow for identifiability analysis, estimation, and prediction with mechanistic mathematical models

Matthew J Simpson¹, Oliver J Maclaren²

Affiliations

¹ School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia.
² Department of Engineering Science, University of Auckland, Auckland, New Zealand.

PMID: 37773942
PMCID: PMC10566698
DOI: 10.1371/journal.pcbi.1011515

Profile-Wise Analysis: A profile likelihood-based workflow for identifiability analysis, estimation, and prediction with mechanistic mathematical models

Matthew J Simpson et al. PLoS Comput Biol. 2023.

. 2023 Sep 29;19(9):e1011515.

doi: 10.1371/journal.pcbi.1011515. eCollection 2023 Sep.

Authors

Matthew J Simpson¹, Oliver J Maclaren²

Affiliations

¹ School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia.
² Department of Engineering Science, University of Auckland, Auckland, New Zealand.

PMID: 37773942
PMCID: PMC10566698
DOI: 10.1371/journal.pcbi.1011515

Abstract

Interpreting data using mechanistic mathematical models provides a foundation for discovery and decision-making in all areas of science and engineering. Developing mechanistic insight by combining mathematical models and experimental data is especially critical in mathematical biology as new data and new types of data are collected and reported. Key steps in using mechanistic mathematical models to interpret data include: (i) identifiability analysis; (ii) parameter estimation; and (iii) model prediction. Here we present a systematic, computationally-efficient workflow we call Profile-Wise Analysis (PWA) that addresses all three steps in a unified way. Recently-developed methods for constructing 'profile-wise' prediction intervals enable this workflow and provide the central linkage between different workflow components. These methods propagate profile-likelihood-based confidence sets for model parameters to predictions in a way that isolates how different parameter combinations affect model predictions. We show how to extend these profile-wise prediction intervals to two-dimensional interest parameters. We then demonstrate how to combine profile-wise prediction confidence sets to give an overall prediction confidence set that approximates the full likelihood-based prediction confidence set well. Our three case studies illustrate practical aspects of the workflow, focusing on ordinary differential equation (ODE) mechanistic models with both Gaussian and non-Gaussian noise models. While the case studies focus on ODE-based models, the workflow applies to other classes of mathematical models, including partial differential equations and simulation-based stochastic models. Open-source software on GitHub can be used to replicate the case studies.

Copyright: © 2023 Simpson, Maclaren. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

PubMed Disclaimer

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1**
Profile-Wise Analysis: A profile likelihood-based workflow illustrating: (A) statistical framework; (B) estimation and prediction using the full likelihood function; and, (C) estimation and prediction using profile-wise, profile likelihood approximations.

**Fig 2**
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Gaussian noise with σ = 10. The MLE (cyan) solution is superimposed, $\hat{θ} = (0.0106, 101.35, 12.35)$ . (B) Univariate profiles for λ, K and C(0), respectively. Each profile is superimposed with the MLE (vertical green) and the 95% threshold ${\hat{ℓ}}_{p} = ℓ^{*} = - Δ_{0.95, 1} / 2$ is shown (horizontal orange). Points of intersection of each profile and the threshold define asymptotic confidence intervals: $\hat{λ} = 0.0106 [0.00656, 0.0179]$ ; $\hat{K} = 101.35 [93.13, 110.41]$ ; and, $\hat{C (0)} = 12.35 [2.95, 25.33]$ . (C)-(E) C(t) trajectories associated with the λ, K and C(0) univariate profiles in (b), respectively. In each case we uniformly discretise the interest parameter using N = 100 points along the 95% confidence interval identified in (B), and solve the model forward with θ = (ψ, ω) and plot the family of solutions (grey), superimposing the MLE (cyan), and we identify the prediction interval defined by these solutions (solid red). (F) Compares approximate prediction intervals obtained by computing the union of the univariate profiles with the ‘exact’ prediction intervals computed from the full likelihood. Trajectories (grey) are constructed by considering N = 10⁴ choices of θ and we plot solutions with $\hat{ℓ} \geq ℓ^{*} = - Δ_{0.95, 3} / 2$ only. These solutions define an ‘exact’ (or gold-standard) prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by taking the union of the three univariate intervals (red).

**Fig 3**
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Gaussian noise with σ = 10. The MLE solution (cyan) is superimposed, $\hat{θ} = (0.0106, 101.35, 12.35)$ . (B)-(D) Bivariate profiles for (λ, K), (λ, C(0)) and (C(0), K), respectively. In (B)-(D) the left-most panel shows the bivariate profile uniformly discretised on a 20 × 20 grid with the ${\hat{ℓ}}_{p} = ℓ^{*} = - Δ_{0.95, 2} / 2$ contour and the MLE (pink disc) superimposed. The right-most panels in (B)-(D) show C(t) predictions (grey) for each grid point contained within the ${\hat{ℓ}}_{p} = ℓ^{*}$ contour, together with the profile-wise prediction interval (solid red). (E) Compares approximate prediction intervals obtained by computing the union of the profile-wise prediction intervals with the ‘exact’ (more precisely, conservative, assuming the parameter confidence set has proper coverage) prediction intervals from the full likelihood function. Predictions (grey) are constructed by considering N = 10⁴ choices of θ and we plot solutions with $\hat{ℓ} \geq ℓ^{*} - Δ_{0.95, 3} / 2$ only. These solutions define a gold-standard prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by the union of the three bivariate trajectories (red).

**Fig 4**
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Gaussian noise with σ = 10. The MLE solution (cyan) is superimposed, $\hat{θ} = (0.0106, 101.35, 12.35)$ . (b)-(d) Bivariate profiles for (λ, K), (λ, C(0)) and (C(0), K), respectively. In (B)-(D) the left-most panel shows 200 points along the ${\hat{ℓ}}_{p} = ℓ^{*} = - Δ_{0.95, 2} / 2$ contours (blue discs) superimposed with the MLE (pink disc). The right-most panels shows 200 C(t) predictions (grey) that are associated with the 200 points along the ${\hat{ℓ}}_{p} = ℓ^{*}$ boundary, together with the profile-wise prediction interval (solid red). (E) Compares approximate prediction intervals obtained by computing the union of the profile-wise prediction intervals with the prediction intervals from the full likelihood function. Predictions (grey) are constructed by considering N = 10⁴ choices of θ and we plot solutions with $\hat{ℓ} \geq ℓ^{*} - Δ_{0.95, 3} / 2$ only. These solutions define a gold-standard prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by the union of the three bivariate trajectories (red).

**Fig 5**
(A) Data obtained by solving Eqs (25) and (26), with θ = (α, β, x(0), y(0)) = (0.9, 1.1, 0.8, 0.3), at t = 0, 0.5, 1.0, …, 10 is corrupted with Gaussian noise with σ = 0.2. The MLE solution (solid curves) is superimposed, $\hat{θ} = (0.983, 1.131, 0.816, 0.335)$ . (B)-(G) Bivariate profiles for (α, β), (α, x(0)), (α, y(0)), (β, x(0)), (β, y(0)) and (x(0), y(0)), respectively. In (B)-(G) the left-most panel shows 100 points along the ${\hat{ℓ}}_{p} = ℓ^{*} = - Δ_{0.95, 2} / 2$ contour (blue discs) and the MLE (pink disc). The middle- and right-most panels show 100 predictions of a(t) and b(t), respectively, associated with the 100 points along the ${\hat{ℓ}}_{p} = ℓ^{*}$ contour, together with the profile-wise prediction interval (solid red). (H) Compares approximate prediction intervals obtained by computing the union of the profile-wise prediction intervals with the prediction intervals from the full likelihood function. Predictions (grey) are constructed by considering N = 10⁴ choices of θ and we plot solutions with $\hat{ℓ} \geq ℓ^{*} - Δ_{0.95, 4} / 2$ only. These solutions define a gold-standard prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by the union of the three bivariate trajectories (red).

**Fig 6**
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Poisson noise. The MLE solution (cyan) is superimposed, $\hat{θ} = (0.00854, 101.85, 12.313)$ . (B) Univariate profiles for λ, K and C(0), respectively. Each profile is superimposed with the MLE (vertical green) and the 95% threshold ${\hat{ℓ}}_{p} = ℓ^{*} = - Δ_{0.95, 1} / 2$ is shown (horizontal orange). Points of intersection of each profile and the threshold define asymptotic confidence intervals: $\hat{λ} = 0.00854 [0.00651, 0.0110]$ ; $\hat{K} = 101.85 [93.29, 111.62]$ ; and, $\hat{C (0)} = 12.313 [8.024, 18.01]$ . (C)-(E) C(t) predictions associated with the λ, K and C(0) univariate profiles in (B), respectively. In each case we uniformly discretise the interest parameter using N = 100 points along the 95% confidence interval identified in (B), and solve the model forward with θ = (ψ, ω) and plot the family of solutions (grey), superimposing the MLE (cyan), and we identify the prediction interval defined by these solutions (solid red). (F) Compares approximate prediction intervals obtained by computing the union of the three univariate profiles with the prediction intervals computed from the full likelihood. Trajectories (grey) are constructed by considering N = 10⁴ choices of θ and we plot solutions with ℓ ≥ ℓ* = −Δ_0.95,3/2 only. These solutions define a prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by taking the union of the three univariate intervals (red).

**Fig 7**
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Poisson noise. The MLE solution (cyan) is superimposed, $\hat{θ} = (0.00854, 101.85, 12.313)$ . (B)-(D) Bivariate profiles for (λ, K), (λ, C(0)) and (C(0), K), respectively. The left-most panel in (B)-(D) shows 200 points along the ${\hat{ℓ}}_{p} = ℓ^{*} = - Δ_{0.95, 2} / 2$ contour (blue discs) and the MLE (pink disc). The right-most panel in (B)-(D) shows 200 C(t) predictions associated with the points along the ${\hat{ℓ}}_{p} = ℓ^{*} = - Δ_{0.95, 2} / 2$ contour that define a prediction interval (solid red). (E) Compares approximate prediction intervals obtained by computing the union of the three bivariate profiles with the exact prediction interval computed from the full likelihood. Trajectories (grey) are constructed by considering N = 10⁴ choices of θ and we plot solutions with $\hat{ℓ} \geq ℓ^{*} = - Δ_{0.95, 3} / 2$ only. These solutions define an exact prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by taking the union of the three univariate intervals (red).

See this image and copyright information in PMC

References

1. Chatzilena A, van Leeuwen E, Ratmann O, Baguelin M, Demiris N. (2019). Contemporary statistical inference for infectious disease models using Stan. Epidemics, 29, 100367. doi: 10.1016/j.epidem.2019.100367 - DOI - PubMed
1. Gabry J, Simpson D, Vehtari A, Betancourt M, Gelman A. 2019. Visualization in Bayesian workflow. Journal of the Royal Statistical Society: Series A (Statistics in Society), 182(2).
1. Gelman A. 2004. Exploratory data analysis for complex models. Journal of Computational and Graphical Statistics. 13, 755–779. doi: 10.1198/106186004X11435 - DOI
1. Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. 2013. Bayesian data analysis, third edition. Taylor & Francis.
1. Gelman A, Vehtari A, Simpson D, Margossian CC, Carpenter B, Yao Y, Kennedy L, Gabry J, Bürkner PC, Modrák M. 2020. Bayesian workflow. arXiv preprint. (arxiv.org/abs/2011.01808).

Publication types

Actions

MeSH terms

Actions
Actions
Actions
Actions
Actions

LinkOut - more resources

Full Text Sources
Miscellaneous
- NCI CPTAC Assay Portal

Save citation to file

Email citation

Add to Collections

Add to My Bibliography

Your saved search

Create a file for external citation management software

Your RSS Feed

Profile-Wise Analysis: A profile likelihood-based workflow for identifiability analysis, estimation, and prediction with mechanistic mathematical models

Affiliations

Profile-Wise Analysis: A profile likelihood-based workflow for identifiability analysis, estimation, and prediction with mechanistic mathematical models

Authors

Affiliations

Abstract

Conflict of interest statement

Figures

References

Publication types

MeSH terms

LinkOut - more resources

Full Text Sources

Miscellaneous