Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2023 Sep 29;19(9):e1011515.
doi: 10.1371/journal.pcbi.1011515. eCollection 2023 Sep.

Profile-Wise Analysis: A profile likelihood-based workflow for identifiability analysis, estimation, and prediction with mechanistic mathematical models

Affiliations

Profile-Wise Analysis: A profile likelihood-based workflow for identifiability analysis, estimation, and prediction with mechanistic mathematical models

Matthew J Simpson et al. PLoS Comput Biol. .

Abstract

Interpreting data using mechanistic mathematical models provides a foundation for discovery and decision-making in all areas of science and engineering. Developing mechanistic insight by combining mathematical models and experimental data is especially critical in mathematical biology as new data and new types of data are collected and reported. Key steps in using mechanistic mathematical models to interpret data include: (i) identifiability analysis; (ii) parameter estimation; and (iii) model prediction. Here we present a systematic, computationally-efficient workflow we call Profile-Wise Analysis (PWA) that addresses all three steps in a unified way. Recently-developed methods for constructing 'profile-wise' prediction intervals enable this workflow and provide the central linkage between different workflow components. These methods propagate profile-likelihood-based confidence sets for model parameters to predictions in a way that isolates how different parameter combinations affect model predictions. We show how to extend these profile-wise prediction intervals to two-dimensional interest parameters. We then demonstrate how to combine profile-wise prediction confidence sets to give an overall prediction confidence set that approximates the full likelihood-based prediction confidence set well. Our three case studies illustrate practical aspects of the workflow, focusing on ordinary differential equation (ODE) mechanistic models with both Gaussian and non-Gaussian noise models. While the case studies focus on ODE-based models, the workflow applies to other classes of mathematical models, including partial differential equations and simulation-based stochastic models. Open-source software on GitHub can be used to replicate the case studies.

PubMed Disclaimer

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1
Fig 1
Profile-Wise Analysis: A profile likelihood-based workflow illustrating: (A) statistical framework; (B) estimation and prediction using the full likelihood function; and, (C) estimation and prediction using profile-wise, profile likelihood approximations.
Fig 2
Fig 2
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Gaussian noise with σ = 10. The MLE (cyan) solution is superimposed, θ^=(0.0106,101.35,12.35). (B) Univariate profiles for λ, K and C(0), respectively. Each profile is superimposed with the MLE (vertical green) and the 95% threshold ^p=*=-Δ0.95,1/2 is shown (horizontal orange). Points of intersection of each profile and the threshold define asymptotic confidence intervals: λ^=0.0106[0.00656,0.0179]; K^=101.35[93.13,110.41]; and, C(0)^=12.35[2.95,25.33]. (C)-(E) C(t) trajectories associated with the λ, K and C(0) univariate profiles in (b), respectively. In each case we uniformly discretise the interest parameter using N = 100 points along the 95% confidence interval identified in (B), and solve the model forward with θ = (ψ, ω) and plot the family of solutions (grey), superimposing the MLE (cyan), and we identify the prediction interval defined by these solutions (solid red). (F) Compares approximate prediction intervals obtained by computing the union of the univariate profiles with the ‘exact’ prediction intervals computed from the full likelihood. Trajectories (grey) are constructed by considering N = 104 choices of θ and we plot solutions with ^*=-Δ0.95,3/2 only. These solutions define an ‘exact’ (or gold-standard) prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by taking the union of the three univariate intervals (red).
Fig 3
Fig 3
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Gaussian noise with σ = 10. The MLE solution (cyan) is superimposed, θ^=(0.0106,101.35,12.35). (B)-(D) Bivariate profiles for (λ, K), (λ, C(0)) and (C(0), K), respectively. In (B)-(D) the left-most panel shows the bivariate profile uniformly discretised on a 20 × 20 grid with the ^p=*=-Δ0.95,2/2 contour and the MLE (pink disc) superimposed. The right-most panels in (B)-(D) show C(t) predictions (grey) for each grid point contained within the ^p=* contour, together with the profile-wise prediction interval (solid red). (E) Compares approximate prediction intervals obtained by computing the union of the profile-wise prediction intervals with the ‘exact’ (more precisely, conservative, assuming the parameter confidence set has proper coverage) prediction intervals from the full likelihood function. Predictions (grey) are constructed by considering N = 104 choices of θ and we plot solutions with ^*-Δ0.95,3/2 only. These solutions define a gold-standard prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by the union of the three bivariate trajectories (red).
Fig 4
Fig 4
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Gaussian noise with σ = 10. The MLE solution (cyan) is superimposed, θ^=(0.0106,101.35,12.35). (b)-(d) Bivariate profiles for (λ, K), (λ, C(0)) and (C(0), K), respectively. In (B)-(D) the left-most panel shows 200 points along the ^p=*=-Δ0.95,2/2 contours (blue discs) superimposed with the MLE (pink disc). The right-most panels shows 200 C(t) predictions (grey) that are associated with the 200 points along the ^p=* boundary, together with the profile-wise prediction interval (solid red). (E) Compares approximate prediction intervals obtained by computing the union of the profile-wise prediction intervals with the prediction intervals from the full likelihood function. Predictions (grey) are constructed by considering N = 104 choices of θ and we plot solutions with ^*-Δ0.95,3/2 only. These solutions define a gold-standard prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by the union of the three bivariate trajectories (red).
Fig 5
Fig 5
(A) Data obtained by solving Eqs (25) and (26), with θ = (α, β, x(0), y(0)) = (0.9, 1.1, 0.8, 0.3), at t = 0, 0.5, 1.0, …, 10 is corrupted with Gaussian noise with σ = 0.2. The MLE solution (solid curves) is superimposed, θ^=(0.983,1.131,0.816,0.335). (B)-(G) Bivariate profiles for (α, β), (α, x(0)), (α, y(0)), (β, x(0)), (β, y(0)) and (x(0), y(0)), respectively. In (B)-(G) the left-most panel shows 100 points along the ^p=*=-Δ0.95,2/2 contour (blue discs) and the MLE (pink disc). The middle- and right-most panels show 100 predictions of a(t) and b(t), respectively, associated with the 100 points along the ^p=* contour, together with the profile-wise prediction interval (solid red). (H) Compares approximate prediction intervals obtained by computing the union of the profile-wise prediction intervals with the prediction intervals from the full likelihood function. Predictions (grey) are constructed by considering N = 104 choices of θ and we plot solutions with ^*-Δ0.95,4/2 only. These solutions define a gold-standard prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by the union of the three bivariate trajectories (red).
Fig 6
Fig 6
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Poisson noise. The MLE solution (cyan) is superimposed, θ^=(0.00854,101.85,12.313). (B) Univariate profiles for λ, K and C(0), respectively. Each profile is superimposed with the MLE (vertical green) and the 95% threshold ^p=*=-Δ0.95,1/2 is shown (horizontal orange). Points of intersection of each profile and the threshold define asymptotic confidence intervals: λ^=0.00854[0.00651,0.0110]; K^=101.85[93.29,111.62]; and, C(0)^=12.313[8.024,18.01]. (C)-(E) C(t) predictions associated with the λ, K and C(0) univariate profiles in (B), respectively. In each case we uniformly discretise the interest parameter using N = 100 points along the 95% confidence interval identified in (B), and solve the model forward with θ = (ψ, ω) and plot the family of solutions (grey), superimposing the MLE (cyan), and we identify the prediction interval defined by these solutions (solid red). (F) Compares approximate prediction intervals obtained by computing the union of the three univariate profiles with the prediction intervals computed from the full likelihood. Trajectories (grey) are constructed by considering N = 104 choices of θ and we plot solutions with * = −Δ0.95,3/2 only. These solutions define a prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by taking the union of the three univariate intervals (red).
Fig 7
Fig 7
(A) Data obtained by solving Eq (22), with θ = (λ, K, C(0)) = (0.01, 100, 10), at t = 0, 100, 200, …, 1000 is corrupted with Poisson noise. The MLE solution (cyan) is superimposed, θ^=(0.00854,101.85,12.313). (B)-(D) Bivariate profiles for (λ, K), (λ, C(0)) and (C(0), K), respectively. The left-most panel in (B)-(D) shows 200 points along the ^p=*=-Δ0.95,2/2 contour (blue discs) and the MLE (pink disc). The right-most panel in (B)-(D) shows 200 C(t) predictions associated with the points along the ^p=*=-Δ0.95,2/2 contour that define a prediction interval (solid red). (E) Compares approximate prediction intervals obtained by computing the union of the three bivariate profiles with the exact prediction interval computed from the full likelihood. Trajectories (grey) are constructed by considering N = 104 choices of θ and we plot solutions with ^*=-Δ0.95,3/2 only. These solutions define an exact prediction interval (dashed gold) that we compare with the MLE solution (cyan) and with the approximate prediction interval formed by taking the union of the three univariate intervals (red).

References

    1. Chatzilena A, van Leeuwen E, Ratmann O, Baguelin M, Demiris N. (2019). Contemporary statistical inference for infectious disease models using Stan. Epidemics, 29, 100367. doi: 10.1016/j.epidem.2019.100367 - DOI - PubMed
    1. Gabry J, Simpson D, Vehtari A, Betancourt M, Gelman A. 2019. Visualization in Bayesian workflow. Journal of the Royal Statistical Society: Series A (Statistics in Society), 182(2).
    1. Gelman A. 2004. Exploratory data analysis for complex models. Journal of Computational and Graphical Statistics. 13, 755–779. doi: 10.1198/106186004X11435 - DOI
    1. Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. 2013. Bayesian data analysis, third edition. Taylor & Francis.
    1. Gelman A, Vehtari A, Simpson D, Margossian CC, Carpenter B, Yao Y, Kennedy L, Gabry J, Bürkner PC, Modrák M. 2020. Bayesian workflow. arXiv preprint. (arxiv.org/abs/2011.01808).

Publication types