Poisson, Poisson-gamma and zero-inflated regression models of motor vehicle crashes: balancing statistical fit and theory
- PMID: 15607273
- DOI: 10.1016/j.aap.2004.02.004
Abstract
Considerable research over the last 20 years has focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions associated with each, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states: perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of "excess" zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriately model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a Bernoulli trial with unequal probability of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data.
A simulation experiment is then conducted to demonstrate how the crash process gives rise to the "excess" zeros frequently observed in crash data. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed, and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales and not from an underlying dual-state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros.
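The argument about excess zeros can be illustrated with a minimal simulation sketch. This is not the authors' experiment: the function names, the exponential spread of exposure across sites, and all parameter values (`mean_exposure`, `base_p`, the 0.5 to 1.5 range of per-trial probabilities) are assumptions chosen only for illustration. Every site follows the same single-state process of Poisson trials, and none is "perfectly safe".

```python
import math
import random


def simulate_site_counts(n_sites, rng, mean_exposure=100.0, base_p=0.005):
    """Crash counts for n_sites under a single-state process.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    counts = []
    for _ in range(n_sites):
        # Exposure (e.g. vehicle passages in the study period) varies by site;
        # every site has at least one trial, so none is "perfectly safe".
        exposure = max(1, int(rng.expovariate(1.0 / mean_exposure)))
        crashes = 0
        for _ in range(exposure):
            # Poisson trials: independent Bernoulli trials with unequal probabilities.
            p = base_p * rng.uniform(0.5, 1.5)
            if rng.random() < p:
                crashes += 1
        counts.append(crashes)
    return counts


def zero_fractions(counts):
    """Return (observed share of zeros, share implied by a Poisson with the same mean)."""
    mean = sum(counts) / len(counts)
    observed = sum(1 for c in counts if c == 0) / len(counts)
    return observed, math.exp(-mean)


if __name__ == "__main__":
    rng = random.Random(0)
    counts = simulate_site_counts(5000, rng)
    observed, poisson_implied = zero_fractions(counts)
    print(f"observed zero share:        {observed:.3f}")
    print(f"Poisson-implied zero share: {poisson_implied:.3f}")
```

Under these assumptions the observed share of zeros exceeds what a single Poisson distribution with the same mean implies, even though no site is in a zero-crash state. Raising `mean_exposure`, which corresponds to a longer observation window or a larger spatial unit, lowers the share of zeros.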