Assessing the goodness of fit of logistic regression models in large samples: A modification of the Hosmer-Lemeshow test
- PMID: 32134502
- DOI: 10.1111/biom.13249
Abstract
Evaluating the goodness of fit of logistic regression models is crucial to ensure the accuracy of the estimated probabilities. Unfortunately, such evaluation is problematic in large samples. Because the power of traditional goodness of fit tests increases with the sample size, practically irrelevant discrepancies between estimated and true probabilities are increasingly likely to cause the rejection of the hypothesis of perfect fit in larger and larger samples. This phenomenon has been widely documented for popular goodness of fit tests, such as the Hosmer-Lemeshow test. To address this limitation, we propose a modification of the Hosmer-Lemeshow approach. By standardizing the noncentrality parameter that characterizes the alternative distribution of the Hosmer-Lemeshow statistic, we introduce a parameter that measures the goodness of fit of a model but does not depend on the sample size. We provide the methodology to estimate this parameter and construct confidence intervals for it. Finally, we propose a formal statistical test to rigorously assess whether the fit of a model, albeit not perfect, is acceptable for practical purposes. The proposed method is compared in a simulation study with a competing modification of the Hosmer-Lemeshow test, based on repeated subsampling. We provide a step-by-step illustration of our method using a model for postneonatal mortality developed in a large cohort of more than 300 000 observations.
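The sample-size effect described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data are hypothetical, and the standardization shown, sqrt(max(C - (G - 2), 0) / n), is an assumed form based on the fact that a noncentral chi-square variable with G - 2 degrees of freedom has mean (G - 2) + lambda. The paper's exact definition, its confidence intervals, and its test of acceptable fit should be taken from the article itself. The function names `hosmer_lemeshow`, `standardized_noncentrality`, and `toy_data` are introduced here for illustration only.

```python
import math


def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow statistic C for binary outcomes y and fitted probabilities p."""
    # Sort observations by fitted probability, then cut into groups of near-equal size.
    order = sorted(range(len(p)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        n_g = len(idx)
        observed = sum(y[i] for i in idx)   # observed events in the group
        expected = sum(p[i] for i in idx)   # events expected under the model
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return stat


def standardized_noncentrality(stat, n, groups=10):
    """ASSUMED standardization: sqrt(max(C - (G - 2), 0) / n)."""
    lambda_hat = max(stat - (groups - 2), 0.0)  # moment estimate of the noncentrality
    return math.sqrt(lambda_hat / n)


def toy_data(k):
    """Hypothetical data: 10 probability levels, observed rates 0.02 above the fitted ones."""
    y, p = [], []
    for g in range(10):
        level = (2 * g + 1) / 20      # fitted probability: 0.05, 0.15, ..., 0.95
        size = 100 * k                # observations at this level
        events = (10 * g + 7) * k     # observed event rate = level + 0.02
        y += [1] * events + [0] * (size - events)
        p += [level] * size
    return y, p


# Same miscalibration (0.02 everywhere), increasing sample size.
for k in (1, 10, 100):
    y, p = toy_data(k)
    c = hosmer_lemeshow(y, p)
    eps = standardized_noncentrality(c, len(y))
    print(f"n={len(y):>6}  C={c:8.2f}  standardized={eps:.4f}")
```

In this toy setting the statistic C grows in proportion to the sample size, so a fixed discrepancy of 0.02 eventually exceeds the usual 5% critical value of a chi-square distribution with 8 degrees of freedom (about 15.51), while the standardized quantity settles to a roughly constant value once C exceeds its degrees of freedom.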
Keywords: Hosmer-Lemeshow test; calibration; goodness of fit; large samples; logistic regression; noncentrality parameter.
© 2020 The International Biometric Society.
Comment in
- Discussion of "Assessing the goodness-of-fit of logistic regression models in large samples: A modification of the Hosmer-Lemeshow test," by Giovanni Nattino, Michael L. Pennell, and Stanley Lemeshow. Biometrics. 2020 Jun;76(2):569-571. doi: 10.1111/biom.13255. Epub 2020 Apr 6. PMID: 32251523. No abstract available.
- Discussion on "Assessing the goodness of fit of logistic regression models in large samples: A modification of the Hosmer-Lemeshow test" by Giovanni Nattino, Michael L. Pennell, and Stanley Lemeshow. Biometrics. 2020 Jun;76(2):572-574. doi: 10.1111/biom.13248. Epub 2020 Apr 6. PMID: 32251529. Free PMC article. No abstract available.
- Discussion on "Assessing the goodness of fit of logistic regression models in large samples: A modification of the Hosmer-Lemeshow test" by Giovanni Nattino, Michael L. Pennell, and Stanley Lemeshow. Biometrics. 2020 Jun;76(2):561-563. doi: 10.1111/biom.13257. Epub 2020 Apr 6. PMID: 32251532. No abstract available.
- Rejoinder to "Assessing the goodness of fit of logistic regression models in large samples: A modification of the Hosmer-Lemeshow test". Biometrics. 2020 Jun;76(2):575-577. doi: 10.1111/biom.13250. Epub 2020 Apr 6. PMID: 32251533. No abstract available.
- Discussion on "Assessing the goodness of fit of logistic regression models in large samples: A modification of the Hosmer-Lemeshow test" by Giovanni Nattino, Michael L. Pennell, and Stanley Lemeshow. Biometrics. 2020 Jun;76(2):564-568. doi: 10.1111/biom.13251. Epub 2020 Apr 6. PMID: 32251538. No abstract available.
