Determining the sample size for a cluster-randomised trial using knowledge elicitation: Bayesian hierarchical modelling of the intracluster correlation coefficient

Svetlana V Tishkovskaya¹, Chris J Sutton², Lois H Thomas³, Caroline L Watkins¹

Affiliations

¹ Lancashire Clinical Trials Unit, Faculty of Health and Care, University of Central Lancashire, Preston, UK.
² Centre for Biostatistics, Division of Population Health, Health Services Research & Primary Care, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK.
³ Faculty of Allied Health and Wellbeing, University of Central Lancashire, Preston, UK.

PMID: 37036110
PMCID: PMC10262340
DOI: 10.1177/17407745231164569

Randomized Controlled Trial

Clin Trials. 2023 Jun;20(3):293-306. doi: 10.1177/17407745231164569. Epub 2023 Apr 10.


Abstract

Background: The intracluster correlation coefficient is a key input parameter for sample size determination in cluster-randomised trials. Sample size is very sensitive to small differences in the intracluster correlation coefficient, so it is vital to have a robust intracluster correlation coefficient estimate. This is often problematic because either a relevant intracluster correlation coefficient estimate is not available or the available estimate is imprecise due to being based on small-scale studies with low numbers of clusters. Misspecification may lead to an underpowered or inefficiently large and potentially unethical trial.
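The sensitivity described in the Background can be illustrated with the standard design effect for equal cluster sizes, 1 + (m - 1) × ICC. The following is a minimal sketch, not taken from the paper: the individually randomised sample size of 200 and the cluster size of 20 are hypothetical values chosen only for illustration.

```python
import math


def design_effect(m, icc):
    """Standard design effect for clusters of equal size m: 1 + (m - 1) * icc."""
    return 1.0 + (m - 1) * icc


def total_sample_size(n_individual, m, icc):
    """Inflate an individually randomised sample size by the design effect,
    then round up to a whole number of clusters of size m."""
    inflated = n_individual * design_effect(m, icc)
    # Small tolerance so floating-point noise does not add a spurious cluster.
    clusters = math.ceil(inflated / m - 1e-9)
    return clusters * m


if __name__ == "__main__":
    n_individual = 200  # hypothetical size needed under individual randomisation
    m = 20              # hypothetical number of participants per cluster
    for icc in (0.01, 0.03, 0.05, 0.10):
        print(f"ICC={icc:.2f} -> total sample size {total_sample_size(n_individual, m, icc)}")
```

Under these assumed inputs, moving the ICC from 0.01 to 0.05 raises the required total from 240 to 400, which is why a poorly chosen ICC can leave a trial underpowered or needlessly large.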

Methods: We apply a Bayesian approach to produce an intracluster correlation coefficient estimate and hence propose a sample size for a planned cluster-randomised trial of the effectiveness of a systematic voiding programme for post-stroke incontinence. A Bayesian hierarchical model is used to combine intracluster correlation coefficient estimates from other relevant trials, making use of the wealth of intracluster correlation coefficient information available in published research. We employ a knowledge elicitation process to assess the relevance of each intracluster correlation coefficient estimate to the planned trial setting. A team of expert reviewers assigned relevance weights to each study, and to each outcome within the study, and these weights informed the parameters of the Bayesian model. To measure the performance of the experts, agreement and reliability methods were applied.
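The abstract does not give the model specification, so the sketch below is not the authors' hierarchical model. It shows one simple way relevance weights could enter a Bayesian synthesis: each external estimate's normal log-likelihood is scaled by its weight (equivalent to inflating its variance by 1/weight), with a flat prior on a grid. The function names, the prior range (0, 0.3), and all numeric inputs are assumptions made for illustration, and no between-study variance is modelled.

```python
import math


def weighted_posterior(estimates, std_errors, weights, grid_size=2001, upper=0.3):
    """Grid posterior for a target ICC under an assumed flat prior on (0, upper).

    Each estimate contributes a normal log-likelihood scaled by its relevance
    weight w in (0, 1], which is the same as inflating se**2 to se**2 / w.
    """
    grid = [upper * (i + 0.5) / grid_size for i in range(grid_size)]
    log_post = []
    for rho in grid:
        lp = 0.0
        for est, se, w in zip(estimates, std_errors, weights):
            lp += -0.5 * w * ((est - rho) / se) ** 2
        log_post.append(lp)
    peak = max(log_post)  # subtract the maximum for numerical stability
    dens = [math.exp(lp - peak) for lp in log_post]
    total = sum(dens)
    return grid, [d / total for d in dens]


def posterior_quantile(grid, probs, q):
    """Smallest grid point whose cumulative posterior probability reaches q."""
    cum = 0.0
    for rho, p in zip(grid, probs):
        cum += p
        if cum >= q:
            return rho
    return grid[-1]


if __name__ == "__main__":
    # Hypothetical ICC estimates, standard errors and relevance weights.
    estimates = [0.02, 0.10, 0.05]
    std_errors = [0.01, 0.01, 0.02]
    weights = [1.0, 0.1, 0.5]
    grid, probs = weighted_posterior(estimates, std_errors, weights)
    for q in (0.025, 0.5, 0.975):
        print(f"posterior quantile {q}: {posterior_quantile(grid, probs, q):.4f}")
```

In this simplified scheme a study judged barely relevant (weight near 0) has little influence on the posterior, while a fully relevant study (weight 1) contributes at its reported precision.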

Results: The 34 intracluster correlation coefficient estimates extracted from 16 previously published trials were combined in the Bayesian hierarchical model using aggregated relevance weights elicited from the experts. The intracluster correlation coefficients available from external sources were used to construct a posterior distribution of the targeted intracluster correlation coefficient, which was summarised as a posterior median with a 95% credible interval, informing researchers about the range of plausible sample size values. The estimated intracluster correlation coefficient determined a sample size of between 450 (25 clusters) and 480 (20 clusters), compared with 500 to 600 from a classical approach. The use of quantiles, and other parameters, from the estimated posterior distribution is illustrated and the impact on sample size described.
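To show how posterior quantiles of the ICC can translate into a range of sample sizes when the number of clusters is fixed, here is a hedged sketch. It uses the standard rearrangement of the design effect for k equal-sized clusters per arm, m = n(1 - ICC) / (k - n × ICC), where n is the per-arm sample size under individual randomisation. The quantile values, n = 100, and the two-arm design are hypothetical assumptions, not figures from the paper.

```python
import math


def cluster_size_for_fixed_k(n_ind_per_arm, k_per_arm, icc):
    """Smallest equal cluster size m such that k*m >= n*(1 + (m - 1)*icc).

    Returns None when k <= n * icc, in which case no cluster size can
    reach the target power with this number of clusters.
    """
    denom = k_per_arm - n_ind_per_arm * icc
    if denom <= 0:
        return None
    # Small tolerance guards against floating-point noise before rounding up.
    return math.ceil(n_ind_per_arm * (1.0 - icc) / denom - 1e-9)


def trial_size_for_fixed_k(n_ind_per_arm, k_per_arm, icc, arms=2):
    """Total participants across all arms, or None if power is unachievable."""
    m = cluster_size_for_fixed_k(n_ind_per_arm, k_per_arm, icc)
    return None if m is None else arms * k_per_arm * m


if __name__ == "__main__":
    n_ind_per_arm = 100  # hypothetical per-arm size under individual randomisation
    # Hypothetical posterior quantiles of the ICC (25%, 50%, 75%).
    quantiles = {"25%": 0.02, "50%": 0.04, "75%": 0.07}
    for k in (5, 10, 20):
        sizes = {name: trial_size_for_fixed_k(n_ind_per_arm, k, icc)
                 for name, icc in quantiles.items()}
        print(f"k={k} clusters per arm: {sizes}")
```

A result of None for a given quantile mirrors the situation where the target power cannot be reached with that number of clusters, however large each cluster is made.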

Conclusion: By accounting for uncertainty in an unknown intracluster correlation coefficient, trials can be designed with a more robust sample size. The approach presented makes it possible to incorporate intracluster correlation coefficients from various cluster-randomised trial settings that differ from the planned study, with the difference being accounted for in the modelling. By using expert knowledge to elicit relevance weights and synthesising the externally available intracluster correlation coefficient estimates, information is used more efficiently than in a classical approach, where the intracluster correlation coefficient estimates tend to be less robust and overly conservative. The intracluster correlation coefficient estimate constructed is likely to produce a smaller sample size on average than the conventional strategy of choosing a conservative intracluster correlation coefficient estimate. This may therefore result in substantial savings of time and resources.

Keywords: Bayesian hierarchical model; cluster-randomised trial; intracluster correlation coefficient; knowledge elicitation; post-stroke incontinence; sample size determination.

Conflict of interest statement

The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Figures

**Figure 1.**
Boxplots showing the spread of reviewer responses about each study’s weight. Study numbers are in the same order as in Table 1. The narrowest boxplot relates to the ICONS feasibility trial.

**Figure 2.**
ICC estimates included in the modelling plotted together with 95% confidence intervals and average study and outcome weights. Box sizes are inversely proportional to variances. The studies are ordered by decreasing relevance to the planned study, based on estimated average study weight. The largest weights were from the ICONS feasibility trial.

**Figure 3.**
Range of sample sizes derived from the posterior interquartile range of the ICC estimate, for a cluster-randomised trial with k equal-sized clusters per arm and the number of clusters fixed at levels of k from 10 to 60. The bullets are sample sizes calculated using the posterior median ICC. The whiskers correspond to the 25% and 75% posterior ICC quantiles. The median, mean and weighted mean columns show sample sizes calculated using a classical multi-estimate method. All numbers correspond to at least 80% power achieved; *: 80% power is not achievable for the ICC 75% quantile; **: the sample size corresponding to the ICC 75% quantile is 1440; na: 80% power is not achievable for this number of clusters.


