Frontiers in Built Environment (Apr 2022)
Using Machine Learning Models to Forecast Severity Level of Traffic Crashes by R Studio and ArcGIS
Abstract
This study describes crash causes, conditions, and the distribution of accident hot spots, and uses machine learning (ML) techniques to analyze the risk factors that significantly affect crash severity levels and their effects on pedestrian safety. Two supervised ML algorithms, random forest and the decision tree-based AdaBoost, are applied and compared for predicting severity level and future crashes from road crash elements. Association rule mining, an unsupervised learning technique, is used to understand the associations among driver characteristics, highway geometric elements, the environment, time, weather, and speed. Slight, medium, and severe injuries and fatalities are also considered to understand the behavior of the drivers who are most likely to cause crashes. Fatalities and injuries are studied with spatial statistical analysis. The variables that most affect crash severity are determined and discussed in detail. The results are evaluated for accuracy, sensitivity (recall), specificity, precision, and F1 score. The impact of driver, vehicle, and road characteristics on traffic crashes is investigated. The random forest model was found to be the most suitable algorithm for predicting crash severity levels.
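As a minimal illustration of the evaluation metrics named in the abstract, the sketch below computes them from the confusion-matrix counts of a single severity class treated one-vs-rest. It is not the study's code (the study reports using R Studio). The function name `binary_metrics` and the counts are hypothetical, chosen only to make the arithmetic concrete, and all denominators are assumed to be nonzero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard classification metrics from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives, false negatives,
    true negatives for one class (e.g. "severe") against all others.
    Assumes every denominator below is nonzero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "recall": sensitivity,     # same quantity as sensitivity
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not data from the study).
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and recall are two names for the same ratio, so the two values always agree. For a multi-class severity outcome, the same computation would be repeated per class.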
Keywords