Information (May 2020)

Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest

  • Mu-Ming Chen
  • Mu-Chen Chen

DOI
https://doi.org/10.3390/info11050270
Journal volume & issue
Vol. 11, no. 5
p. 270

Abstract

To reduce the damage caused by road accidents, researchers have applied a range of techniques to explore correlated factors and develop efficient prediction models. This study compares one statistical technique, logistic regression (LR), with two nonparametric data mining techniques, classification and regression tree (CART) and random forest (RF). It has three aims: to compare the prediction capability of the three techniques; to identify the significant variables (from LR) and the important variables (from CART or RF) that are strongly correlated with road accident severity; and to distinguish the variables that have a significant positive influence on prediction performance. Three performance measures, accuracy, sensitivity, and specificity, are used to find the best integrated method, which consists of the most effective prediction model together with the input variables that have a higher positive influence on those three measures.
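
As an illustration of the three evaluation measures named in the abstract, the following minimal sketch computes accuracy, sensitivity, and specificity from binary labels. The function name `classification_metrics`, the toy labels, and the choice of `1` as the positive class are assumptions made for this example; the abstract does not say which severity level is treated as positive, and this is not the authors' code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels.

    `positive` marks the class treated as the positive outcome
    (an assumption here; the abstract does not define it).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of actual positives that were detected (true positive rate).
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of actual negatives that were detected (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical labels, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when severity classes are imbalanced, because a model can reach high accuracy while rarely detecting the less frequent class.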

Keywords