A simulation study of sample size for multilevel logistic regression models

Rahim Moineddin¹, Flora I Matheson, Richard H Glazier

PMID: 17634107
PMCID: PMC1955447
DOI: 10.1186/1471-2288-7-34

BMC Med Res Methodol. 2007 Jul 16;7:34.

Affiliation

¹ Department of Public Health Sciences, University of Toronto, Toronto, Canada. rahim.moineddin@utoronto.ca

Abstract

Background: Many studies conducted in the health and social sciences collect individual-level data as outcome measures. Such data usually have a hierarchical structure, with patients clustered within physicians and physicians clustered within practices. Large surveys, including national surveys, also have a hierarchical or clustered structure: respondents are naturally clustered in geographical units (e.g., health regions) and may be grouped into smaller units. Outcomes of interest in many fields include not only continuous measures but also binary outcomes such as depression, presence or absence of a disease, and self-reported general health. In multilevel studies, an important problem is calculating a sample size that is adequate to produce unbiased and accurate estimates.

Methods: In this paper, simulation studies are used to assess how varying the sample size at both the individual and group levels affects the accuracy of the estimates of the parameters and variance components of multilevel logistic regression models. In addition, the influence of the prevalence of the outcome and of the intra-class correlation coefficient (ICC) is examined.
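The kind of data-generating step such a simulation relies on can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a two-level random-intercept logistic model with a single standard-normal covariate, and it assumes the latent-variable definition of the ICC, ICC = σ²/(σ² + π²/3), which the abstract does not specify. The function names and parameter values are hypothetical.

```python
import math
import random


def intercept_variance_from_icc(icc):
    """Random-intercept variance implied by a latent-scale ICC.

    Assumption: ICC = sigma2 / (sigma2 + pi^2 / 3), the latent-variable
    definition often used for logistic models. The abstract does not state
    which definition the study used.
    """
    return icc * (math.pi ** 2 / 3) / (1 - icc)


def simulate_two_level_logistic(n_groups, group_size, beta0, beta1, icc, seed=0):
    """Simulate (group, x, y) rows from a random-intercept logistic model."""
    rng = random.Random(seed)
    sd_u = math.sqrt(intercept_variance_from_icc(icc))
    rows = []
    for g in range(n_groups):
        u = rng.gauss(0.0, sd_u)          # group-level random intercept
        for _ in range(group_size):
            x = rng.gauss(0.0, 1.0)       # individual-level covariate
            eta = beta0 + beta1 * x + u   # linear predictor on the logit scale
            p = 1.0 / (1.0 + math.exp(-eta))
            y = 1 if rng.random() < p else 0
            rows.append((g, x, y))
    return rows


# Illustrative values only: 50 groups of 50, ICC of 0.1.
data = simulate_two_level_logistic(
    n_groups=50, group_size=50, beta0=-1.0, beta1=0.5, icc=0.1
)
prevalence = sum(y for _, _, y in data) / len(data)
print(len(data), round(prevalence, 3))
```

In a full study of this type, data sets like this would be generated repeatedly for each combination of number of groups, group size, prevalence, and ICC, a multilevel logistic model would be fitted to each, and the estimates would be compared with the true values used to generate the data.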

Results: The results show that the estimates of the fixed effect parameters are unbiased for 100 groups with a group size of 50 or higher. The estimates of the variance-covariance components are slightly biased even with 100 groups and a group size of 50. The biases for both fixed and random effects are severe for a group size of 5. The standard errors of the fixed effect parameters are unbiased, while those of the variance-covariance components are underestimated. The results suggest that low-prevalence events require larger sample sizes, with a minimum of 100 groups and 50 individuals per group.

Conclusion: We recommend using a minimum group size of 50 with at least 50 groups to produce valid estimates for multilevel logistic regression models. When the prevalence of events is low, group size should be increased so that the expected number of events in each group is greater than one.
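The group-size rule in the conclusion reduces to simple arithmetic: with prevalence p and group size n, the expected number of events per group is n × p, so the rule requires n > 1/p. The sketch below works this out and combines it with the recommended minimum of 50 per group. The function names are hypothetical, and reading the two recommendations as "take the larger of the two" is an interpretation, not something the abstract states.

```python
import math
from fractions import Fraction


def min_group_size_for_events(prevalence):
    """Smallest integer n with n * prevalence strictly greater than one."""
    # Converting via str() avoids binary floating-point rounding,
    # e.g. 0.02 becomes exactly 1/50.
    p = Fraction(str(prevalence))
    if not 0 < p <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    return math.floor(1 / p) + 1


def recommended_group_size(prevalence, base_minimum=50):
    """Larger of the base minimum (50 in the abstract) and the events rule."""
    return max(base_minimum, min_group_size_for_events(prevalence))


for p in (0.10, 0.02, 0.01):
    print(p, min_group_size_for_events(p), recommended_group_size(p))
```

For a prevalence of 10%, the events rule is already satisfied by the base minimum of 50. For a prevalence of 1%, a group of 50 expects only 0.5 events, so the rule calls for more than 100 individuals per group.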

Cited by

Quality and Integrated Service Delivery: A Cross-Sectional Study of the Effects of Malaria and Antenatal Service Quality on Malaria Intervention Use in Sub-Saharan Africa.
Lee EH, Mancuso JD, Koehlmoos T, Stewart VA, Bennett JW, Olsen C. Trop Med Infect Dis. 2022 Nov 9;7(11):363. doi: 10.3390/tropicalmed7110363. PMID: 36355905. Free PMC article.
Sample Size Requirements for Simple and Complex Mediation Models.
Sim M, Kim SY, Suh Y. Educ Psychol Meas. 2022 Feb;82(1):76-106. doi: 10.1177/00131644211003261. Epub 2021 Apr 19. PMID: 34992307. Free PMC article.
Management factors associated with bovine respiratory disease in preweaned calves on California dairies: The BRD 100 study.
Maier GU, Love WJ, Karle BM, Dubrovsky SA, Williams DR, Champagne JD, Anderson RJ, Rowe JD, Lehenbauer TW, Van Eenennaam AL, Aly SS. J Dairy Sci. 2019 Aug;102(8):7288-7305. doi: 10.3168/jds.2018-14773. Epub 2019 Jun 13. PMID: 31202656. Free PMC article.
Hospital variation in intravenous inotrope use for patients hospitalized with heart failure: insights from Get With The Guidelines.
Allen LA, Fonarow GC, Grau-Sepulveda MV, Hernandez AF, Peterson PN, Partovian C, Li SX, Heidenreich PA, Bhatt DL, Peterson ED, Krumholz HM; American Heart Association’s Get With The Guidelines Heart Failure Investigators. Circ Heart Fail. 2014 Mar 1;7(2):251-60. doi: 10.1161/CIRCHEARTFAILURE.113.000761. Epub 2014 Jan 31. PMID: 24488983. Free PMC article. Clinical Trial.
Adequate Sample Sizes for a Three-Level Growth Model.
Lee E, Hong S. Front Psychol. 2021 Jul 1;12:685496. doi: 10.3389/fpsyg.2021.685496. eCollection 2021. PMID: 34276510. Free PMC article.
