Psychol Methods. 2011 Dec;16(4):373-90.
doi: 10.1037/a0025813. Epub 2011 Oct 31.

Fitting multilevel models with ordinal outcomes: performance of alternative specifications and methods of estimation

Daniel J Bauer et al. Psychol Methods. 2011 Dec.

Abstract

Previous research has compared methods of estimation for fitting multilevel models to binary data, but there are reasons to believe that the results will not always generalize to the ordinal case. This article thus evaluates (a) whether and when fitting multilevel linear models to ordinal outcome data is justified and (b) which estimator to employ when instead fitting multilevel cumulative logit models to ordinal data: maximum likelihood (ML) or penalized quasi-likelihood (PQL). ML and PQL are compared across variations in sample size, magnitude of variance components, number of outcome categories, and distribution shape. Fitting a multilevel linear model to ordinal outcomes is shown to be inferior in virtually all circumstances. PQL performance improves markedly with the number of ordinal categories, regardless of distribution shape. In contrast to the binary case, PQL often performs as well as ML when used with ordinal data. Further, the performance of PQL is typically superior to that of ML when the data include a small to moderate number of clusters (i.e., ≤ 50 clusters).
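
For readers less familiar with the model class, the following is a minimal sketch of how ordinal responses can be generated from a random-intercept cumulative logit model. It is not the simulation design of the article: the function name `simulate_ordinal`, the thresholds, the slope `beta`, and the random-intercept standard deviation `tau` are all hypothetical values chosen for illustration.

```python
import numpy as np

def simulate_ordinal(n_clusters=50, cluster_size=10,
                     thresholds=(-1.0, 0.0, 1.0),
                     beta=0.5, tau=1.0, seed=0):
    """Draw ordinal outcomes from a random-intercept cumulative logit model.

    Model: P(Y <= c | x, u) = logistic(threshold_c - (beta * x + u)),
    with u ~ N(0, tau^2) shared by all observations in a cluster.
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    thresholds = np.asarray(thresholds, dtype=float)  # must be increasing
    u = rng.normal(0.0, tau, size=n_clusters)         # cluster random intercepts
    cluster = np.repeat(np.arange(n_clusters), cluster_size)
    x = rng.normal(size=cluster.size)                 # one observation-level predictor
    eta = beta * x + u[cluster]                       # linear predictor
    # Latent-variable form: y* = eta + standard logistic error.
    y_star = eta + rng.logistic(size=cluster.size)
    # Category = number of thresholds strictly below y*, so Y <= c iff y* <= threshold_c.
    y = np.searchsorted(thresholds, y_star)
    return cluster, x, y

cluster, x, y = simulate_ordinal()
```

With three thresholds the outcome takes values 0 through 3. Data of this form could then be fitted either with a linear multilevel model or with a cumulative logit model estimated by ML or PQL, which is the comparison the abstract describes.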

Figures

Figure 1
Marginal category distributions used in the simulation study (averaged over predictors and random effects). Notes. Within 3-, 5-, and 7-category outcome conditions, marginal frequencies are held constant but permuted across categories to manipulate the distribution shape (bell-shaped, skewed, or polarized) without changing sparseness. Within 2-category outcome conditions, it is impossible to hold marginal frequencies constant while manipulating shape (balanced or unbalanced).
Figure 2
Average bias for the three fixed effect estimates (excluding thresholds) across estimator, number of outcome categories, and distribution shape. Notes. The normal-theory REML (Restricted Maximum Likelihood) estimator was used when fitting the linear multilevel model. The estimators of PQL (Penalized Quasi-Likelihood) or ML (with adaptive quadrature) were used when fitting the multilevel cumulative logit (logistic) model. Points for two-category conditions are not connected to points for 3-7 category conditions because their distribution shapes do not correspond. Results show that bias is large and sensitive to distribution shape when using the linear model but not when using the cumulative logit model (either estimator). Results are collapsed over the number of clusters, cluster size, and the magnitude of the random effects.
Figure 3
Average bias for the three fixed effect estimates (excluding thresholds) across logistic estimators, number of outcome categories and cluster size. Notes. Logistic estimators were either PQL (Penalized Quasi-Likelihood) or ML (Maximum Likelihood) with adaptive quadrature. Results show that PQL produces somewhat negatively biased fixed effect estimates, particularly when random effects have large variances, whereas the estimates obtained from logistic ML show small, positive bias. In both cases, bias decreases with the number of categories of the outcome. Results are collapsed over number of clusters and distribution shape and do not include linear multilevel model conditions.
Figure 4
Mean-Squared Error (MSE) for the fixed effects (excluding thresholds) across number of outcome categories, number of clusters, and cluster size. Notes. MSE is indicated by the height of the vertical lines, and it is broken into components representing squared bias (portion of the line below the symbol) and sampling variance (portion of the line above the symbol). The scale differs across panels and is discontinuous in the upper right panel. MSE is averaged across the three fixed effects. Results are plotted for multilevel cumulative logit models; PQL denotes Penalized Quasi-Likelihood and ML denotes Maximum Likelihood with adaptive quadrature. This plot does not include linear multilevel model conditions.
Figure 5
Average bias for the standard errors (SE) of the three fixed effect estimates (excluding thresholds) across number of outcome categories and cluster size. Notes. Results are plotted for multilevel cumulative logit models; PQL denotes Penalized Quasi-Likelihood and ML denotes Maximum Likelihood with adaptive quadrature. This plot does not include linear multilevel model conditions.
Figure 6
Mean-Squared Error (MSE) for the standard deviation of the random intercept, across number of outcome categories, number of clusters, and cluster size. Notes. The scale differs across panels and is discontinuous in the upper right panel. See Figure 4 notes for definition of quantities in this plot.
Figure 7
Mean-Squared Error (MSE) for the standard deviation of the random slope, across number of outcome categories, number of clusters, and cluster size. Notes. The scale differs across panels and is discontinuous in the upper right panel. See Figure 4 notes for definition of quantities in this plot.
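
The Figure 4 notes describe MSE as the sum of squared bias and sampling variance. As a minimal sketch of that decomposition, the function below splits the MSE of a set of replicate estimates into those two parts. The function name `mse_decomposition` and the example numbers are hypothetical and are not results from the article.

```python
import numpy as np

def mse_decomposition(estimates, true_value):
    """Split the MSE of replicate estimates into squared bias and variance.

    Uses the population variance (ddof=0) so that
    mse == squared_bias + variance holds exactly.
    """
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - true_value
    variance = estimates.var()  # ddof=0 by default
    mse = np.mean((estimates - true_value) ** 2)
    return {"squared_bias": bias ** 2, "variance": variance, "mse": mse}

# Hypothetical estimates of a parameter whose true value is 0.5.
parts = mse_decomposition([0.4, 0.5, 0.6, 0.7], true_value=0.5)
```

In a simulation study, `estimates` would hold one estimate per replication within a design cell, so an estimator with some bias can still have lower MSE than an unbiased one if its sampling variance is sufficiently smaller.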
