Transportation Research Interdisciplinary Perspectives (Nov 2020)
Justification for considering zero-inflated models in crash frequency analysis
Abstract
One common challenge of modeling intersection-related crash data is the high proportion of sites with zero crashes. Although extensive research has been done on appropriate methods to handle excess zeroes, there is some reluctance to use zero-inflated models in the traffic safety literature. The primary purpose of this paper is to evaluate zero-inflated models to determine whether they are a suitable method for modeling crash counts. An appropriate approach to model selection is to choose the model that best accomplishes the research objectives rather than attempting to discover the true underlying data generating process. Thus, using zero-inflated models is warranted when they outperform other models relative to the research objectives. In addition, using zero-inflated models does not assume that sites are in an inherently safe or unsafe state, and these models should not be summarily dismissed on the basis of disagreement with the hypothesized underlying data generating process. Secondarily, we compare implementations of zero-inflated Poisson, zero-inflated negative binomial, and negative binomial-Lindley Bayesian hierarchical models using intersection-related crash data for the state of Utah from 2014 to 2018. Specifically, we compare the models' quality of fit, as determined by a Bayesian χ² goodness-of-fit test, and their relative predictive accuracy. The zero-inflated negative binomial model performs best overall. We conclude that there are cases where zero-inflated models perform as well as or better than other comparable models and may be considered a viable option for modeling crash counts.
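As a rough illustration of what "zero-inflated" means here, the sketch below writes out the probability mass function of a zero-inflated Poisson distribution as a two-part mixture: a point mass at zero plus an ordinary Poisson count. It is not the paper's implementation, which uses Bayesian hierarchical models fit to crash data. The function names and the values of `pi_zero` and `lam` are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing k events under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, pi_zero, lam):
    """Zero-inflated Poisson pmf.

    With probability pi_zero the count is an extra zero; otherwise it is
    drawn from Poisson(lam). The mixture can be read simply as a flexible
    count distribution with additional mass at zero.
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


# Hypothetical parameter values, not estimates from any data set.
pi_zero, lam = 0.3, 1.5

# Probabilities for counts 0..59; the tail beyond 59 is negligible here.
probs = [zip_pmf(k, pi_zero, lam) for k in range(60)]

print("P(Y = 0), plain Poisson:        ", poisson_pmf(0, lam))
print("P(Y = 0), zero-inflated Poisson:", zip_pmf(0, pi_zero, lam))
print("Mean of zero-inflated Poisson:  ", sum(k * p for k, p in enumerate(probs)))
```

Under these assumptions, the zero-inflated version places more probability on zero than a plain Poisson with the same rate, and its mean shrinks to (1 − pi_zero) × lam. A zero-inflated negative binomial follows the same pattern with the Poisson component replaced by a negative binomial.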