BioData Min. 2016 Jan 27;9:5.

doi: 10.1186/s13040-016-0084-6. eCollection 2016.

The prediction accuracy of dynamic mixed-effects models in clustered data

Brian S Finkelman¹, Benjamin French², Stephen E Kimmel³

Affiliations

¹ Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA ; Center for Therapeutic Effectiveness Research, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA.
² Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA.
³ Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA ; Center for Therapeutic Effectiveness Research, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA ; Department of Medicine, Cardiovascular Division, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA.

PMID: 26819631
PMCID: PMC4728760
DOI: 10.1186/s13040-016-0084-6

Brian S Finkelman et al. BioData Min. 2016.

Abstract

Background: Clinical prediction models often fail to generalize in the context of clustered data, because most models fail to account for heterogeneity in outcome values and covariate effects across clusters. Furthermore, standard approaches for modeling clustered data, including generalized linear mixed-effects models, would not be expected to provide accurate predictions in novel clusters, because such predictions are typically based on the hypothetical mean cluster. We hypothesized that dynamic mixed-effects models, which incorporate data from previous predictions to refine the model for future predictions, would allow for cluster-specific predictions in novel clusters as the model is updated over time, thus improving overall model generalizability.

Results: We quantified the potential gains in prediction accuracy from using a dynamic modeling strategy in a simulation study. Furthermore, because clinical prediction models in the context of clustered data often involve outcomes that are dependent on patient volume, we examined whether using dynamic mixed-effects models would be robust to misspecification of the volume-outcome relationship. Our results indicated that dynamic mixed-effects models led to substantial improvements in prediction accuracy in clustered populations over a broad range of conditions, and were uniformly superior to static models. In addition, dynamic mixed-effects models were particularly robust to misspecification of the volume-outcome relationship and to variation in the frequency of model updating. The extent of the improvement in prediction accuracy that was observed with dynamic mixed-effects models depended on the relative impact of fixed and random effects on the outcome as well as the degree of misspecification of model fixed effects.

Conclusions: Dynamic mixed-effects models led to substantial improvements in prediction model accuracy across a broad range of simulated conditions. Therefore, dynamic mixed-effects models could be a useful alternative to standard static models for improving the generalizability of clinical prediction models in the setting of clustered data, and, thus, well worth the logistical challenges that may accompany their implementation in practice.

Keywords: Bayesian statistics; Clustered data; Dynamic modeling; Generalizability; Mixed-effects models; Prediction.
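The contrast the abstract draws between static and dynamic prediction in a novel cluster can be illustrated with a minimal sketch. This is not the authors' Bayesian mixed-effects implementation or simulation design. It assumes a single random intercept, a known fixed-effect slope, known variance components, and an update after every patient; the function name, parameter names, and default values are all hypothetical. The static prediction uses the hypothetical mean cluster (intercept 0), while the dynamic prediction adds a shrinkage estimate of the clinic intercept built from earlier patients at the same clinic.

```python
import random


def simulate_novel_clinic(n_patients=200, beta=1.0, clinic_effect=1.5,
                          tau2=1.0, sigma2=1.0, seed=0):
    """Compare static and dynamic predictions at one previously unseen clinic.

    All names and defaults are illustrative assumptions, not values from the paper.
    beta          : fixed-effect slope, assumed known
    clinic_effect : true random intercept of the new clinic
    tau2, sigma2  : between-clinic and residual variances, assumed known
    Returns (static MAE, dynamic MAE).
    """
    rng = random.Random(seed)
    residual_sum = 0.0  # running sum of (y - beta * x) over earlier patients
    abs_err_static = 0.0
    abs_err_dynamic = 0.0

    for j in range(n_patients):
        x = rng.gauss(0.0, 1.0)
        y = beta * x + clinic_effect + rng.gauss(0.0, sigma2 ** 0.5)

        # Shrinkage (posterior-mean) estimate of the clinic intercept from the
        # j patients seen so far; equals 0 when j == 0.
        b_hat = tau2 * residual_sum / (j * tau2 + sigma2)

        static_pred = beta * x           # prediction for the mean cluster
        dynamic_pred = beta * x + b_hat  # cluster-specific prediction

        abs_err_static += abs(y - static_pred)
        abs_err_dynamic += abs(y - dynamic_pred)

        # Only after predicting is the new outcome used to update the model.
        residual_sum += y - beta * x

    return abs_err_static / n_patients, abs_err_dynamic / n_patients


static_mae, dynamic_mae = simulate_novel_clinic()
print(f"static MAE:  {static_mae:.3f}")
print(f"dynamic MAE: {dynamic_mae:.3f}")
```

In this toy setting the dynamic error shrinks toward the residual noise level as patients accumulate at the clinic, whereas the static error keeps carrying the full clinic intercept.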

Figures

**Fig. 1**
Relative improvement in MAE for both dynamic and static models across all main parameter combinations. Plots show the density of values for relative improvement in MAE across 1,000 simulations, with horizontal bars representing the mean value. All other parameters are fixed at their base values. The center figure represents the base parameter combination

**Fig. 2**
Relative improvement in MAE by clinic-size quintile. Plots show the density of values for relative improvement in MAE across 1,000 simulations, with horizontal bars representing the mean value. These results are for the base parameter combination

**Fig. 3**
Effect of the update interval on the rate of improvement in prediction accuracy at a given clinic. This plot shows the mean relative improvement in MAE for prediction j at clinic i, across 1,000 simulations for different values of the update interval, θ. Vertical dashed and dotted lines indicate the point at which 80% of the total gains in prediction accuracy have been achieved for the dynamic BLME model with a random intercept and the dynamic BLME model with a random intercept and random slope, respectively. Note that the base value of θ is 500, and all other parameters are fixed at their base values

**Fig. 4**
Effect of unknown patient-level predictor on model prediction accuracy. Plots show the density of values for relative improvement in MAE across 1,000 simulations, with horizontal bars representing the mean value, for different values of β₂, which controls the size of the effect of the unknown patient-level predictor, X_2ij, on the outcome, Y_ij. Note that the relative contribution of X_2ij to the total variance in Y_ij, compared to X_1ij, is equal to β₂². All other parameters are fixed at their base values

**Fig. 5**
Effect of an association between clinic size and the outcome on model prediction accuracy. Plots show the density of values for relative improvement in MAE across 1,000 simulations, with horizontal bars representing the mean value, for different values of γ, which controls the size of the effect of scaled clinic size, f(N_i), on the outcome, Y_ij. Note that the relative contribution of f(N_i) to the total variance in Y_ij, compared to X_1ij, is equal to γ². All other parameters are fixed at their base values

**Fig. 6**
Effect of including clinic-size quintile as a fixed effect on prediction model accuracy. Plots show the density of values for relative improvement in MAE across 1,000 simulations, with horizontal bars representing the mean value, for different values of γ, which controls the size of the effect of scaled clinic size, f(N_i), on the outcome, Y_ij. All models include clinic-size quintile, N_i^*, as a categorical fixed effect, because N_i^* was defined to be observed while N_i was defined to be unobserved. Note that the relative contribution of f(N_i) to the total variance in Y_ij, compared to X_1ij, is equal to γ². All other parameters are fixed at their base values

**Fig. 7**
Computational time of static and dynamic models. The mean computational times in seconds for the static models are shown by the open circles. From bottom to top, the circles represent the linear model, the BLME model with a random intercept, and the BLME model with a random intercept and a random slope. The mean computational times for the dynamic linear model, BLME model with a random intercept, and BLME model with a random intercept and a random slope, are shown by the solid, dashed, and dotted lines, respectively. All parameters are fixed at their base values
