PLoS One. 2014 Jan 9;9(1):e84601.
doi: 10.1371/journal.pone.0084601. eCollection 2014.

Generalized linear mixed models for binary data: are matching results from penalized quasi-likelihood and numerical integration less biased?

Andrea Benedetti et al. PLoS One.

Abstract

Background: Over time, adaptive Gauss-Hermite quadrature (QUAD) has become the preferred method for estimating generalized linear mixed models with binary outcomes. However, penalized quasi-likelihood (PQL) is still used frequently. In this work, we systematically evaluated, via simulation, whether matching results from PQL and QUAD indicate less bias in the estimated regression coefficients and variance parameters.

Methods: We performed a simulation study in which we varied characteristics of the data set, including its size, the probability of the outcome, the variance of the random effect, the number of clusters and the number of subjects per cluster. We estimated the bias in the regression coefficients, odds ratios and variance parameters obtained via PQL and via QUAD. We then assessed whether similarity between the PQL and QUAD estimates of the regression coefficients, odds ratios and variance parameters predicted less bias.
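As an illustration of the kind of data such a simulation could generate, the following is a minimal sketch, not the authors' code. It assumes a logistic model with a single normally distributed random intercept per cluster and one binary covariate; the function name `simulate_clustered_binary` and all parameter names are hypothetical, and the abstract does not state the exact data-generating model.

```python
import numpy as np


def simulate_clustered_binary(n_clusters, n_per_cluster, beta0, beta1, sigma, seed=0):
    """Simulate clustered binary outcomes (assumed random-intercept logistic model)."""
    rng = np.random.default_rng(seed)

    # Cluster label for every subject: 0,0,...,1,1,...
    cluster = np.repeat(np.arange(n_clusters), n_per_cluster)

    # One random intercept per cluster, with standard deviation sigma
    u = rng.normal(0.0, sigma, size=n_clusters)

    # Hypothetical binary covariate with prevalence 0.5
    x = rng.binomial(1, 0.5, size=cluster.size)

    # Linear predictor on the logit scale; beta0 controls the outcome probability
    eta = beta0 + beta1 * x + u[cluster]
    p = 1.0 / (1.0 + np.exp(-eta))

    # Binary outcome drawn subject by subject
    y = rng.binomial(1, p)
    return cluster, x, y


# Example: 50 clusters of 10 subjects, true odds ratio exp(beta1)
cluster, x, y = simulate_clustered_binary(50, 10, beta0=-1.0, beta1=np.log(2.0), sigma=1.0)
```

Each simulated data set would then be fitted once with PQL and once with QUAD in the software of choice, and the two sets of estimates compared against the known generating values.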

Results: Overall, we found that the absolute percent bias of the odds ratio estimated via PQL or QUAD increased as the PQL- and QUAD-estimated odds ratios became more discrepant, though results varied markedly depending on the characteristics of the data set.
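The two quantities related in the results can be sketched as follows. This is an illustrative sketch under stated assumptions: the abstract does not give the exact formulas, so the denominator used for the absolute percent difference (here the QUAD estimate) is an assumption, and the function names and numeric values are hypothetical.

```python
import numpy as np


def abs_percent_bias(estimate, truth):
    """Absolute percent bias of an estimate relative to the true value."""
    estimate = np.asarray(estimate, dtype=float)
    return 100.0 * np.abs(estimate - truth) / abs(truth)


def abs_percent_difference(or_pql, or_quad):
    """Absolute percent difference between PQL and QUAD estimates.

    Assumption: the difference is expressed relative to the QUAD estimate.
    """
    or_pql = np.asarray(or_pql, dtype=float)
    or_quad = np.asarray(or_quad, dtype=float)
    return 100.0 * np.abs(or_pql - or_quad) / np.abs(or_quad)


def slope_and_r2(x, y):
    """Slope and R^2 from a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return slope, 1.0 - ss_res / ss_tot


# Hypothetical estimates from three simulated data sets; true odds ratio is 2.0
true_or = 2.0
or_pql = [1.8, 1.9, 1.6]
or_quad = [2.0, 2.1, 2.3]

discrepancy = abs_percent_difference(or_pql, or_quad)
bias_quad = abs_percent_bias(or_quad, true_or)
slope, r2 = slope_and_r2(discrepancy, bias_quad)
```

Within one simulation scenario, a positive slope would correspond to the pattern reported above: larger PQL-QUAD discrepancies going with larger bias.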

Conclusions: Given how markedly results varied depending on data set characteristics, it proved impossible to specify a threshold for the discrepancy above which results should be considered biased. This work suggests that, in situations where PQL is known to give reasonable results, comparing results from generalized linear mixed models estimated via PQL and QUAD is a worthwhile check on the regression coefficients and variance components obtained via QUAD.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Boxplot depicting the slopes from separate simple linear regressions for the effect of the absolute percent difference in OR_PQL and OR_QUAD on the absolute percent bias in OR_QUAD or OR_PQL, respectively, overall and by data generation parameters.
The center line of each box is the median of the estimated slopes, the box edges are the 25th and 75th percentiles, and the ends of the dashed lines are the 10th and 90th percentiles.
Figure 2. Barplot depicting the proportion of scenarios in which the absolute percent difference in OR_PQL and OR_QUAD was a statistically significant predictor of the absolute percent bias in OR_QUAD or OR_PQL, respectively, from separate simple linear regressions, overall and by data generation parameters.
Figure 3. Boxplot depicting the R² from separate simple linear regressions for the effect of the absolute percent difference in OR_PQL and OR_QUAD on the absolute percent bias in OR_QUAD or OR_PQL, respectively, overall and by data generation parameters.
The center line of each box is the median R², the box edges are the 25th and 75th percentiles, and the ends of the dashed lines are the 10th and 90th percentiles.
Figure 4. Boxplot depicting the slopes from separate simple linear regressions for the effect of the absolute percent difference in σ_PQL and σ_QUAD on the absolute percent bias in σ_QUAD or σ_PQL, respectively, overall and by data generation parameters.
The center line of each box is the median of the estimated slopes, the box edges are the 25th and 75th percentiles, and the ends of the dashed lines are the 10th and 90th percentiles.
Figure 5. Barplot depicting the proportion of scenarios in which the absolute percent difference in σ_PQL and σ_QUAD was a statistically significant predictor of the absolute percent bias in σ_QUAD or σ_PQL, respectively, from separate simple linear regressions, overall and by data generation parameters.
Figure 6. Boxplot depicting the R² from separate simple linear regressions for the effect of the absolute percent difference in σ_PQL and σ_QUAD on the absolute percent bias in σ_QUAD or σ_PQL, respectively, overall and by data generation parameters.
The center line of each box is the median R², the box edges are the 25th and 75th percentiles, and the ends of the dashed lines are the 10th and 90th percentiles.
