BMC Med Res Methodol. 2012 Sep 10;12:137.
doi: 10.1186/1471-2288-12-137.

Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: an example from a vertigo phase III study with longitudinal count data as primary endpoint


Christine Adrion et al. BMC Med Res Methodol. 2012.

Abstract

Background: A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect the SAP to pre-specify inferential analyses and other important statistical techniques. Writing a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions about, and justification of, many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex, not easily implemented, and have several limitations. It is therefore of interest to explore Bayesian alternatives that provide the decision support needed to finalize a SAP.

Methods: We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score).

Results: The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial.

Conclusions: The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint.
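The logarithmic score and the PIT named in the Methods can be illustrated with a small sketch. It assumes a Poisson predictive distribution with a known mean for each observation, which only stands in for the leave-one-out predictive distributions that the paper obtains from INLA. The function names are hypothetical and are not taken from the paper or from any INLA software.

```python
import math

# Illustrative sketch only: a Poisson predictive distribution with known mean
# is ASSUMED here in place of the INLA leave-one-out predictive distributions.

def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson distribution with mean mu > 0."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))

def poisson_cdf(y, mu):
    """P(Y <= y); returns 0 for y < 0."""
    if y < 0:
        return 0.0
    return sum(poisson_pmf(j, mu) for j in range(y + 1))

def log_score(y, mu):
    """Logarithmic score: minus the log predictive probability of the
    observed count. Smaller values indicate better predictions."""
    return -math.log(poisson_pmf(y, mu))

def mean_log_score(ys, mus):
    """Average logarithmic score over all observations."""
    return sum(log_score(y, m) for y, m in zip(ys, mus)) / len(ys)

def pit_interval(y, mu):
    """For count data the PIT is not a single number: the observed count y
    maps to the interval [F(y - 1), F(y)] of the predictive CDF F."""
    return poisson_cdf(y - 1, mu), poisson_cdf(y, mu)

# Hypothetical observed counts and predictive means for three observations.
observed = [0, 2, 5]
pred_means = [1.0, 2.0, 2.5]
print(mean_log_score(observed, pred_means))
print([pit_interval(y, m) for y, m in zip(observed, pred_means)])
```

Because the mean logarithmic score is an average over observations rather than a sum, its scale does not grow with the number of observations, which is in line with the abstract's remark that it is not sensitive to sample size.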


Figures

Figure 1
Trajectory plots for vertigo data. Effect of betahistine-dihydrochloride on the frequency of attacks of vertigo in a total of 112 Menière’s disease patients; 2 treatment groups: “low-dosage” (50 patients) vs. “high-dosage” (62 patients). A) individual trajectories for vertigo data. B) and C) display the conditional posterior mean trajectories of the number of attacks depending upon fixed and random effects after fitting a Poisson GLMM (I: model with random intercepts. IS: model with random intercepts and slopes). The same color is used to indicate observations and model-based estimates for the same patient.
Figure 2
Vertigo data: INLA vs. MCMC approach. Bayesian inference for fixed effects (Poisson random slope model): comparison of samples from a long MCMC chain (□) with the posterior marginals computed with the Laplace approximation (—) obtained by using INLA. The vertical blue line shows the posterior mean.
Figure 3
Vertigo data: PIT histograms for all candidate models. U-shaped histograms indicate under-dispersed predictive distributions, hump or inverse-U shaped histograms point at overdispersion, and skewed histograms occur when central tendencies are biased. Dashed gray lines show the histogram height corresponding to perfect calibration.
Figure 4
Simulation study: Discriminatory power of DIC and LS¯ for different scenarios (100 runs per scenario). Data generating process: longitudinal, negative binomial counts with subject-specific intercept (balanced design); modeling strategy: Poisson GLMM with random intercept; number of subjects per group: n = 20, 50, 100; degree of overdispersion: k = 0.5, 1, 5, 10, 20, 50. As k increases, the degree of overdispersion decreases and the negative binomial converges to a Poisson distribution. Hence, DIC and LS¯ decline. Note that the range of DIC increases with larger sample size.
Figure 5
Simulation study: Variability of mean LS within different simulation scenarios. Variability of the mean logarithmic score LS¯(r) for the true negative binomial (NB) model compared with the arcsinh, (zero-inflated) Poisson, and zero-inflated NB models (r = 1,…,100 iterations per scenario). Sample size: n = 20, 50, 100 subjects per group; k = 0.5, 1, 5, 10, 20, 50 determines the amount of overdispersion. Each row corresponds to a different value of k, and each plot shows LS¯(r) for each competing model. The right-hand column of panels has the largest sample size n; the top row shows the results for highly overdispersed counts (k = 0.5).
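The role of k in the simulation captions can be made concrete with a short sketch. It assumes the common mean/size parameterization of the negative binomial, in which the variance is mu + mu^2/k; the captions do not state the parameterization explicitly, so this is an assumption, chosen because it matches the described behavior (small k means strong overdispersion, large k approaches the Poisson). The function names are hypothetical.

```python
import math

# Illustrative sketch only. ASSUMED parameterization: mean mu, size k,
# variance mu + mu**2 / k (not stated explicitly in the captions).

def nb_variance(mu, k):
    """Variance of the negative binomial under the assumed parameterization."""
    return mu + mu ** 2 / k

def nb_pmf(y, mu, k):
    """P(Y = y) for a negative binomial with mean mu and size k."""
    log_p = (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
             + k * math.log(k / (k + mu))
             + y * math.log(mu / (k + mu)))
    return math.exp(log_p)

def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson distribution with mean mu."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))

mu = 3.0  # hypothetical mean count
for k in [0.5, 1, 5, 10, 20, 50]:  # values of k used in the simulation study
    # The variance shrinks toward the Poisson variance (mu) as k grows,
    # and the probability of a given count approaches the Poisson value.
    print(k, nb_variance(mu, k), nb_pmf(2, mu, k), poisson_pmf(2, mu))
```

Under this assumption, a Poisson model is most clearly misspecified at k = 0.5 and nearly adequate at k = 50, which is the gradient along which the captions describe DIC and the mean logarithmic score declining.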
