Random-effects, fixed-effects and the within-between specification for clustered data in observational health studies: a simulation study
- PMID: 25343620
- PMCID: PMC4208783
- DOI: 10.1371/journal.pone.0110257
Erratum in
- Correction: Random-Effects, Fixed-Effects and the within-between Specification for Clustered Data in Observational Health Studies: A Simulation Study. PLoS One. 2016 May 24;11(5):e0156508. doi: 10.1371/journal.pone.0156508. eCollection 2016. PMID: 27218254
Abstract
Background: When unaccounted-for group-level characteristics affect an outcome variable, traditional linear regression is inefficient and can be biased. The random- and fixed-effects estimators (RE and FE, respectively) are two competing methods that address these problems. While each estimator controls for otherwise unaccounted-for effects, the two estimators require different assumptions. Health researchers tend to favor RE estimation, while researchers from some other disciplines tend to favor FE estimation. In addition to RE and FE, an alternative method called within-between (WB) was suggested by Mundlak in 1978, although it is used infrequently.
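For reference, the three specifications can be written for a single covariate as follows. The notation (individual i in group j, unobserved group effect u_j, group mean of x written as \bar{x}_j) is an assumption chosen for illustration and is not taken from the paper.

```latex
\begin{align}
\text{RE:}\quad & y_{ij} = \beta_0 + \beta x_{ij} + u_j + \varepsilon_{ij},
  \qquad \operatorname{Cov}(u_j, x_{ij}) = 0 \text{ assumed} \\
\text{FE:}\quad & y_{ij} - \bar{y}_j = \beta\,(x_{ij} - \bar{x}_j)
  + (\varepsilon_{ij} - \bar{\varepsilon}_j) \\
\text{WB:}\quad & y_{ij} = \beta_0 + \beta_W (x_{ij} - \bar{x}_j)
  + \beta_B \bar{x}_j + u_j + \varepsilon_{ij}
\end{align}
```

In the WB form, \beta_W plays the role of the FE (within-group) effect and \beta_B captures the between-group association; the two coincide when the RE assumption holds.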
Methods: We conduct a simulation study to compare RE, FE, and WB estimation across 16,200 scenarios. The scenarios vary in the number of groups, the size of the groups, within-group variation, goodness-of-fit of the model, and the degree to which the model is correctly specified. Estimator preference is determined by lowest mean squared error of the estimated marginal effect and root mean squared error of fitted values.
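A much-reduced sketch of this kind of simulation is given below. It is not the authors' code or design: the data-generating process, the parameter names (`rho`, `n_groups`, `group_size`) and the single scenario are assumptions made for illustration. It also does not implement the RE (GLS) estimator; pooled OLS stands in as the "traditional linear regression" baseline, compared with the FE within estimator and the within coefficient of the WB specification.

```python
import numpy as np

def simulate(n_groups, group_size, beta, rho, rng):
    """Draw one clustered data set (hypothetical design, not the paper's).
    rho != 0 makes x correlated with the unobserved group effect u."""
    g = np.repeat(np.arange(n_groups), group_size)   # group index per row
    u = rng.normal(size=n_groups)                    # unobserved group effect
    x = rho * u[g] + rng.normal(size=g.size)
    y = beta * x + u[g] + rng.normal(size=g.size)
    return g, x, y

def group_means(v, g, n_groups):
    """Group mean of v, expanded back to one value per row."""
    sums = np.bincount(g, weights=v, minlength=n_groups)
    counts = np.bincount(g, minlength=n_groups)
    return (sums / counts)[g]

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def estimates(g, x, y, n_groups):
    """Slope estimates: pooled OLS, FE (within), and WB within coefficient."""
    xbar = group_means(x, g, n_groups)
    ybar = group_means(y, g, n_groups)
    ones = np.ones_like(x)
    xw = x - xbar                                    # within-group deviation
    pooled = ols(np.column_stack([ones, x]), y)[1]
    fe = (xw @ (y - ybar)) / (xw @ xw)
    wb = ols(np.column_stack([ones, xw, xbar]), y)[1]
    return pooled, fe, wb

def mse_by_estimator(reps=200, n_groups=30, group_size=10,
                     beta=1.0, rho=0.5, seed=0):
    """Mean squared error of each slope estimate over repeated draws."""
    rng = np.random.default_rng(seed)
    draws = np.array([
        estimates(*simulate(n_groups, group_size, beta, rho, rng), n_groups)
        for _ in range(reps)
    ])
    mse = ((draws - beta) ** 2).mean(axis=0)
    return dict(zip(("pooled", "fe", "wb_within"), mse))

if __name__ == "__main__":
    print(mse_by_estimator())
```

In this single-covariate setup the WB within coefficient matches the FE estimate up to floating-point error, because the within-group deviations are orthogonal to both the constant and the group means. A fuller replication would add an RE estimator and vary the number of groups, group size, within-group variation, and degree of misspecification, as the Methods describe.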
Results: Although there are scenarios when each estimator is most appropriate, the cases in which traditional RE estimation is preferred are less common. In finite samples, the WB approach outperforms both traditional estimators. The Hausman test guides the practitioner to the estimator with the smallest absolute error only 61% of the time, and in many sample sizes simply applying the WB approach produces smaller absolute errors than following the suggestion of the test.
Conclusions: Specification and estimation should be carefully considered and ultimately guided by the objective of the analysis and characteristics of the data. The WB approach has been underutilized, particularly for inference on marginal effects in small samples. Blindly applying any estimator can lead to bias, inefficiency, and flawed inference.