A comparison of zero-inflated and hurdle models for modeling zero-inflated count data
- PMID: 34760432
- PMCID: PMC8570364
- DOI: 10.1186/s40488-021-00121-4
Abstract
Count data with excessive zeros are frequently encountered in practice. For example, the number of health service visits often includes many zeros, representing patients with no utilization during the follow-up period. A common feature of this type of data is that the count measure has more zeros than a common count distribution, such as the Poisson or negative binomial, can accommodate. Zero-inflated (ZI) or hurdle models are often used to fit such data. Despite the increasing popularity of ZI and hurdle models, there is still a lack of investigation of the fundamental differences between these two types of models. In this article, we reviewed the zero-inflated and hurdle models and highlighted their differences in terms of their data-generating processes. We also conducted simulation studies to evaluate the performance of both types of models. The final choice of regression model should be made after a careful assessment of goodness of fit and should be tailored to the particular data in question.
Keywords: Hurdle model; Model diagnosis; Zero deflation; Zero inflation.
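The difference in data-generating processes mentioned in the abstract can be illustrated with a minimal simulation sketch. It is not taken from the article: the function names, the parameter values, and the choice of a Poisson count component (rather than negative binomial) are assumptions made for illustration. In the zero-inflated model, zeros come from two sources (a structural-zero class and the count distribution itself), whereas in the hurdle model all zeros come from a single binary component and positive counts follow a zero-truncated distribution.

```python
import numpy as np


def sample_zip(rng, n, pi, lam):
    """Zero-inflated Poisson: structural zero with probability pi,
    otherwise an ordinary Poisson(lam) draw, which may itself be zero."""
    structural_zero = rng.random(n) < pi
    counts = rng.poisson(lam, size=n)
    return np.where(structural_zero, 0, counts)


def sample_truncated_poisson(rng, n, lam):
    """Zero-truncated Poisson(lam) by rejection: redraw any zeros."""
    out = rng.poisson(lam, size=n)
    zero = out == 0
    while zero.any():
        out[zero] = rng.poisson(lam, size=int(zero.sum()))
        zero = out == 0
    return out


def sample_hurdle(rng, n, p0, lam):
    """Hurdle Poisson: zero with probability p0, otherwise a draw
    from the zero-truncated Poisson(lam). All zeros share one source."""
    is_zero = rng.random(n) < p0
    positives = sample_truncated_poisson(rng, n, lam)
    return np.where(is_zero, 0, positives)


def zip_zero_prob(pi, lam):
    """P(Y = 0) under the zero-inflated Poisson."""
    return pi + (1.0 - pi) * np.exp(-lam)


if __name__ == "__main__":
    rng = np.random.default_rng(2021)  # arbitrary seed
    n, lam = 100_000, 2.0              # illustrative values only

    y_zip = sample_zip(rng, n, pi=0.3, lam=lam)
    y_hurdle = sample_hurdle(rng, n, p0=0.05, lam=lam)

    print("Poisson P(Y=0):      ", np.exp(-lam))
    print("ZIP theoretical P(0):", zip_zero_prob(0.3, lam))
    print("ZIP empirical P(0):  ", np.mean(y_zip == 0))
    print("Hurdle empirical P(0):", np.mean(y_hurdle == 0))
```

Under these assumptions, the zero-inflated model can only add zeros relative to the plain Poisson when the mixing probability lies between 0 and 1, while the hurdle model's zero probability is a free parameter and can fall below the Poisson value, which corresponds to the zero deflation listed in the keywords.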
© The Author(s) 2021.
Conflict of interest statement
Competing interests: The author(s) declare that they have no competing interests.