Safety (Nov 2023)
Explainable Boosting Machine: A Contemporary Glass-Box Model to Analyze Work Zone-Related Road Traffic Crashes
Abstract
Examining the factors that contribute to work zone crashes and implementing measures to reduce their occurrence can significantly improve road safety. In this research, we used the explainable boosting machine (EBM), a modern glass-box machine learning (ML) model, to classify and predict work zone-related crashes and to interpret the contributing factors. We used work zone crash data from the state of New Jersey collected over two years (2017 and 2018), and addressed the class imbalance in these data by applying data augmentation strategies such as the synthetic minority over-sampling technique (SMOTE), borderline-SMOTE, and SVM-SMOTE. The EBM model was trained on the augmented data, with Bayesian optimization used for hyperparameter tuning. Its performance was evaluated and compared with that of black-box ML models such as combined kernel and tree boosting (KTBoost; Python 3.7.1 and KTBoost package version 0.2.2), light gradient boosting machine (LightGBM version 3.2.1), and extreme gradient boosting (XGBoost version 1.7.6). The EBM model trained on borderline-SMOTE-treated data demonstrated greater efficacy with respect to precision (81.37%), recall (82.53%), geometric mean (75.39%), and Matthews correlation coefficient (0.43). The EBM model also allows for an in-depth evaluation of single factors and pairwise factor interactions in predicting work zone-related crash severity. It supports both global and local interpretation and assists in assessing the influence of the various factors.
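For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how precision, recall, geometric mean, and the Matthews correlation coefficient can be computed from a two-class confusion matrix. It assumes a binary setting and the common definition of the geometric mean as the square root of recall times specificity; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute four imbalance-aware metrics from binary confusion-matrix counts.

    Assumes a two-class problem; any ratio with a zero denominator is
    reported as 0.0 (an illustrative convention, not the study's).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0           # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Geometric mean of sensitivity and specificity
    g_mean = math.sqrt(recall * specificity)
    # Matthews correlation coefficient, ranging from -1 to 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Hypothetical counts chosen only to illustrate the arithmetic
example = binary_metrics(tp=80, fp=20, fn=20, tn=80)
```

Unlike precision and recall, the geometric mean and the Matthews correlation coefficient take the true negatives into account, which is why they are often reported alongside precision and recall when classes are imbalanced.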
Keywords