BMC Med Res Methodol. 2022 Aug 4;22(1):211.
doi: 10.1186/s12874-022-01685-8.

A comparison of statistical methods for modeling count data with an application to hospital length of stay

Gustavo A Fernandez et al. BMC Med Res Methodol. 2022.

Abstract

Background: Hospital length of stay (LOS) is a key indicator of hospital care management efficiency, cost of care, and hospital planning. Hospital LOS is often used as a measure of a post-medical procedure outcome, as a guide to the benefit of a treatment of interest, or as an important risk factor for adverse events. Therefore, understanding hospital LOS variability is always an important healthcare focus. Hospital LOS data can be treated as count data, with discrete and non-negative values, typically right skewed, and often exhibiting excessive zeros. In this study, we compared the performance of the Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB) regression models using simulated and empirical data.
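For orientation, the four models can be written as special cases of one distribution. The abstract does not state a parameterization, so the symbols here are assumptions: \pi is the probability of a structural zero, \mu is the mean of the count component, and r is the negative binomial size (inverse dispersion) parameter. Under that commonly used form, the ZINB probability mass function is:

```latex
\begin{aligned}
P(Y = 0) &= \pi + (1 - \pi)\left(\frac{r}{r + \mu}\right)^{r}, \\
P(Y = y) &= (1 - \pi)\,\frac{\Gamma(y + r)}{\Gamma(r)\, y!}
            \left(\frac{r}{r + \mu}\right)^{r}
            \left(\frac{\mu}{r + \mu}\right)^{y}, \qquad y = 1, 2, \dots \\[4pt]
\mathrm{E}[Y] &= (1 - \pi)\,\mu, \qquad
\frac{\mathrm{Var}[Y]}{\mathrm{E}[Y]} = 1 + \pi\mu + \frac{\mu}{r}.
\end{aligned}
```

Setting \pi = 0 gives the NB model, letting r \to \infty gives ZIP, and doing both gives the Poisson model with variance equal to the mean. The variance-to-mean ratio shows that zero-inflation alone (\pi > 0, r \to \infty) already produces overdispersion, through the \pi\mu term.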

Methods: Data were generated under different simulation scenarios with varying sample sizes, proportions of zeros, and levels of overdispersion. Analysis of hospital LOS was conducted using empirical data from the Medical Information Mart for Intensive Care database.
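A minimal sketch of how such scenarios could be generated is shown below. It is not the authors' simulation code: the function names, the parameter values (mean 3, zero probability 0.3, size 1.5), and the intercept-only setup without covariates are all assumptions chosen for illustration. It uses only the Python standard library, drawing negative binomial counts as a gamma-Poisson mixture.

```python
import math
import random
from statistics import mean, pvariance


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_counts(n, mu, zero_prob=0.0, size=None, seed=0):
    """Simulate n counts (hypothetical helper, not from the paper).

    zero_prob: probability of a structural zero (zero-inflation).
    size: NB size parameter; None means no extra overdispersion (Poisson).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        if rng.random() < zero_prob:
            counts.append(0)  # structural zero
            continue
        # Gamma-Poisson mixture: mean mu, variance mu + mu**2 / size.
        lam = mu if size is None else rng.gammavariate(size, mu / size)
        counts.append(poisson_draw(rng, lam))
    return counts


def summarize(counts):
    """Return the sample mean, variance-to-mean ratio, and share of zeros."""
    m = mean(counts)
    return {
        "mean": m,
        "dispersion": pvariance(counts) / m,
        "zero_fraction": counts.count(0) / len(counts),
    }


# Illustrative scenarios; the values are assumptions, not the paper's settings.
scenarios = {
    "Poisson": dict(zero_prob=0.0, size=None),
    "ZIP": dict(zero_prob=0.3, size=None),
    "NB": dict(zero_prob=0.0, size=1.5),
    "ZINB": dict(zero_prob=0.3, size=1.5),
}

for name, params in scenarios.items():
    stats = summarize(simulate_counts(5000, mu=3.0, seed=42, **params))
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

A variance-to-mean ratio near 1 indicates equidispersion. Ratios well above 1, together with a zero fraction larger than a Poisson model with the same mean would predict, are the features that motivate the NB, ZIP, and ZINB alternatives.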

Results: Poisson and ZIP models performed poorly on overdispersed data. ZIP outperformed the other regression models when the overdispersion was due to zero-inflation only. NB and ZINB regression models faced substantial convergence issues when incorrectly used to model equidispersed data. The NB model provided the best fit for overdispersed data and outperformed the ZINB model in many simulation scenarios with combinations of zero-inflation and overdispersion, regardless of sample size. In the empirical data analysis, we demonstrated that fitting incorrect models to overdispersed data led to incorrect regression coefficient estimates and overstated the significance of some predictors.

Conclusions: Based on this study, we recommend that researchers consider ZIP models for count data with zero-inflation only and NB models for overdispersed data or data with combinations of zero-inflation and overdispersion. If the researcher believes that two different data-generating mechanisms produce zeros, then the ZINB regression model may provide greater flexibility when modeling the zero-inflation and overdispersion.

Keywords: Count data; Negative binomial regression; Poisson regression; Simulation study; Zero-inflated Poisson regression; Zero-inflated negative binomial regression.

Conflict of interest statement

None.

Figures

Fig. 1
Histogram of hospital length of stay for patients with asthma diagnosis, n = 2,167
