A comparison of statistical methods for modeling count data with an application to hospital length of stay
- PMID: 35927612
- PMCID: PMC9351158
- DOI: 10.1186/s12874-022-01685-8
Abstract
Background: Hospital length of stay (LOS) is a key indicator of hospital care management efficiency, cost of care, and hospital planning. Hospital LOS is often used as a measure of a post-medical procedure outcome, as a guide to the benefit of a treatment of interest, or as an important risk factor for adverse events. Therefore, understanding hospital LOS variability is always an important healthcare focus. Hospital LOS data can be treated as count data, with discrete and non-negative values, typically right skewed, and often exhibiting excessive zeros. In this study, we compared the performance of the Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB) regression models using simulated and empirical data.
Methods: Data were generated under different simulation scenarios with varying sample sizes, proportions of zeros, and levels of overdispersion. Analysis of hospital LOS was conducted using empirical data from the Medical Information Mart for Intensive Care database.
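The kind of simulation design described in the Methods can be sketched as follows. This is a minimal illustration and not the authors' code: the function names (`poisson_draw`, `simulate_counts`, `summarize`), the parameter names (`mu`, `theta`, `pi_zero`), and the parametrization (a gamma-Poisson mixture with dispersion `theta`, plus a probability `pi_zero` of a structural zero) are all assumptions made for the example.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by CDF inversion (suitable for moderate lam)."""
    if lam <= 0:
        return 0
    u = rng.random()
    k = 0
    p = math.exp(-lam)  # P(Y = 0)
    cdf = p
    # Walk up the CDF until it exceeds u; stop if p underflows to zero.
    while u > cdf and p > 0.0:
        k += 1
        p *= lam / k
        cdf += p
    return k


def simulate_counts(n, mu, theta=None, pi_zero=0.0, seed=0):
    """Simulate n counts (hypothetical helper, not from the paper).

    theta=None   -> Poisson(mu), i.e. equidispersed counts
    theta=float  -> negative binomial via gamma-Poisson mixture,
                    with variance mu + mu**2 / theta
    pi_zero > 0  -> adds structural zeros (zero-inflation)
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        if rng.random() < pi_zero:
            counts.append(0)  # structural zero
            continue
        # Gamma with shape theta and scale mu/theta has mean mu.
        lam = mu if theta is None else rng.gammavariate(theta, mu / theta)
        counts.append(poisson_draw(rng, lam))
    return counts


def summarize(counts):
    """Return the mean, variance-to-mean ratio, and proportion of zeros."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return {
        "mean": mean,
        "var_to_mean": var / mean,
        "prop_zero": counts.count(0) / n,
    }


if __name__ == "__main__":
    scenarios = {
        "Poisson": dict(theta=None, pi_zero=0.0),
        "NB": dict(theta=1.0, pi_zero=0.0),
        "ZIP": dict(theta=None, pi_zero=0.3),
        "ZINB": dict(theta=1.0, pi_zero=0.3),
    }
    for name, kwargs in scenarios.items():
        print(name, summarize(simulate_counts(5000, mu=3.0, **kwargs)))
```

A variance-to-mean ratio near 1 indicates equidispersion, while values well above 1 indicate overdispersion. Varying `n`, `pi_zero`, and `theta` mirrors the three factors the study varied (sample size, proportion of zeros, and level of overdispersion).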
Results: Poisson and ZIP models performed poorly in overdispersed data. ZIP outperformed the other regression models when the overdispersion was due to zero-inflation only. NB and ZINB regression models faced substantial convergence issues when incorrectly used to model equidispersed data. The NB model provided the best fit in overdispersed data and outperformed the ZINB model in many simulation scenarios with combinations of zero-inflation and overdispersion, regardless of the sample size. In the empirical data analysis, we demonstrated that fitting incorrect models to overdispersed data led to incorrect regression coefficient estimates and overstated the significance of some of the predictors.
Conclusions: Based on this study, we recommend that researchers consider ZIP models for count data with zero-inflation only and NB models for overdispersed data or data with combinations of zero-inflation and overdispersion. If the researcher believes there are two different data-generating mechanisms producing zeros, then the ZINB regression model may provide greater flexibility when modeling the zero-inflation and overdispersion.
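The "two mechanisms producing zeros" idea can be made concrete with the standard ZINB probability mass function. The notation here is an assumption for illustration and is not taken from the abstract: $\pi$ is the probability of a structural zero, $\mu$ is the mean of the count component, and $\theta$ is the NB dispersion parameter.

```latex
\begin{aligned}
P(Y = 0) &= \pi + (1 - \pi)\left(\frac{\theta}{\theta + \mu}\right)^{\theta}, \\
P(Y = y) &= (1 - \pi)\,\frac{\Gamma(y + \theta)}{\Gamma(\theta)\, y!}
            \left(\frac{\theta}{\theta + \mu}\right)^{\theta}
            \left(\frac{\mu}{\theta + \mu}\right)^{y},
            \qquad y = 1, 2, \ldots
\end{aligned}
```

The first term of $P(Y = 0)$ is the structural-zero mechanism and the second is the sampling zero from the NB count process. Setting $\pi = 0$ recovers the NB model, whose variance is $\mu + \mu^{2}/\theta$; letting $\theta \to \infty$ recovers the ZIP model; applying both recovers the Poisson model.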
Keywords: Count data; Negative binomial regression; Poisson regression; Simulation study; Zero-inflated Poisson regression; Zero-inflated negative binomial regression.
© 2022. The Author(s).
Conflict of interest statement
None.