Justification for considering zero-inflated models in crash frequency analysis

Timo Pew; Richard L. Warr; Grant G. Schultz; Matthew Heaton

Transportation Research Interdisciplinary Perspectives (Nov 2020)

Affiliations

Timo Pew: Department of Statistics, Brigham Young University, 2152 WVB, Brigham Young University, Provo, UT 84602, United States
Richard L. Warr: Department of Statistics, Brigham Young University, 2152 WVB, Brigham Young University, Provo, UT 84602, United States; Corresponding author.
Grant G. Schultz: Department of Civil & Environmental Engineering, Brigham Young University, 430 EB, Provo, UT 84602, United States
Matthew Heaton: Department of Statistics, Brigham Young University, 2152 WVB, Brigham Young University, Provo, UT 84602, United States

Journal volume & issue: Vol. 8, p. 100249

Abstract

A common challenge in modeling intersection-related crash data is the high proportion of sites with zero crashes. Extensive research has been done on appropriate methods for handling excess zeros, yet there is some reluctance to use zero-inflated models in the traffic safety literature. The primary purpose of this paper is to evaluate zero-inflated models to determine whether they are a suitable method for modeling crash counts. An appropriate approach to model selection is to choose the model that best accomplishes the research objectives rather than attempting to discover the true underlying data-generating process. Thus, using zero-inflated models is warranted when they outperform other models relative to the research objectives. In addition, using zero-inflated models does not assume that sites are in an inherently safe or unsafe state, and these models should not be summarily dismissed on the basis of disagreement with the hypothesized underlying data-generating process. Secondarily, we compare implementations of zero-inflated Poisson, zero-inflated negative binomial, and negative binomial-Lindley Bayesian hierarchical models using intersection-related crash data for the state of Utah from 2014 to 2018. We specifically compare the quality of fit, as determined by a Bayesian χ² goodness-of-fit test, and the models' relative predictive accuracy. The zero-inflated negative binomial model performs best overall. We conclude that there are cases where zero-inflated models perform as well as or better than other comparable models and may be considered a viable option for modeling crash counts.
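
For readers unfamiliar with the term, the sketch below illustrates how a zero-inflated Poisson distribution places extra probability on zero counts compared with an ordinary Poisson. It uses the standard textbook form of the distribution, not the Bayesian hierarchical implementation evaluated in the paper; the function names and the values of `lam` and `pi` are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(y, lam):
    """P(Y = y) for an ordinary Poisson count with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    base = (1.0 - pi) * poisson_pmf(y, lam)
    return pi + base if y == 0 else base


# Hypothetical parameter values, for illustration only.
lam, pi = 1.5, 0.4

p0_poisson = poisson_pmf(0, lam)  # zero probability without inflation
p0_zip = zip_pmf(0, lam, pi)      # zero probability with inflation

# Sanity check: the zero-inflated probabilities still sum to one.
total = sum(zip_pmf(y, lam, pi) for y in range(60))

print(f"P(Y=0) Poisson: {p0_poisson:.4f}")
print(f"P(Y=0) ZIP:     {p0_zip:.4f}")
```

Under these assumed values the zero-inflated form assigns more probability to zero than the plain Poisson does, which is the property that makes it a candidate for data with many zero-crash sites.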

Published in Transportation Research Interdisciplinary Perspectives

ISSN: 2590-1982 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Social Sciences: Transportation and communications
Website: https://www.journals.elsevier.com/transportation-research-interdisciplinary-perspectives
