Enhancing prediction and analysis of UK road traffic accident severity using AI: Integration of machine learning, econometric techniques, and time series forecasting in public health research

Md Abu Sufian; Jayasree Varadarajan; Mingbo Niu

Heliyon (Apr 2024)

Affiliations

Md Abu Sufian: IVR Low-Carbon Research Institute, Chang'an University, Shaanxi, 710018, China; School of Computing and Mathematical Sciences, University of Leicester, UK
Jayasree Varadarajan: School of Geography, University of Leicester, UK
Mingbo Niu: IVR Low-Carbon Research Institute, Chang'an University, Shaanxi, 710018, China; Corresponding author.

Journal volume & issue: Vol. 10, no. 7
p. e28547

Abstract

This research project investigates the severity of road traffic accidents in the UK, combining machine learning algorithms, econometric techniques, and traditional statistical methods to analyse longitudinal historical data. The analysis framework includes descriptive, inferential, bivariate, and multivariate methods; correlation analysis using Pearson's and Spearman's rank correlation coefficients; multiple logistic regression models; multicollinearity assessment; and model validation. To address heteroscedasticity and autocorrelation in the error terms, we applied the Generalized Method of Moments (GMM), which improved the precision and reliability of our regression analyses. For time series forecasting, we applied the Vector Autoregressive (VAR) model and Autoregressive Integrated Moving Average (ARIMA) models. The forecasts achieved a Mean Absolute Scaled Error (MASE) of 0.800 and a Mean Error (ME) of -73.80 compared to a naive forecast. The project further extends its machine learning application with a random forest classifier that attains a precision of 73%, a recall of 78%, and an F1-score of 73%. Building on this, we used the H2O AutoML process to optimize model selection, which yielded an XGBoost model with an RMSE of 0.1761205782994506 and an MAE of 0.0874235576229789. Factor analysis was used to identify underlying factors that explain the pattern of correlations within a set of observed variables. Scoring history, a tool for observing a model's performance throughout the training process, was used to monitor and improve the performance of our machine learning models. We also incorporated Explainable AI (XAI) techniques, using SHAP (Shapley Additive Explanations) to understand the factors contributing to accident severity.
Features such as Driver_Home_Area_Type, Longitude, Driver_IMD_Decile, Road_Type, Casualty_Home_Area_Type, and Casualty_IMD_Decile were identified as significant influencers. Our research contributes to a nuanced understanding of traffic accident severity and demonstrates the potential of advanced statistical, econometric, and machine learning techniques to inform evidence-based interventions and policies for enhancing road safety.
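
As a reading aid for the forecast metrics quoted in the abstract, the following minimal sketch shows how MASE and ME are commonly computed for a non-seasonal series. It is not the authors' code. The function names, the toy numbers, and the sign convention for ME (actual minus forecast) are assumptions made for illustration. A MASE below 1 means the forecast has a lower mean absolute error than a one-step naive forecast on the training data.

```python
def mase(actual, forecast, train):
    """Mean Absolute Scaled Error for a non-seasonal series.

    Scales the forecast's mean absolute error by the in-sample mean
    absolute error of a one-step naive forecast (previous value).
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(train) < 2:
        raise ValueError("train needs at least two observations")
    mae_model = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    mae_naive = sum(abs(train[i] - train[i - 1])
                    for i in range(1, len(train))) / (len(train) - 1)
    if mae_naive == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")
    return mae_model / mae_naive


def mean_error(actual, forecast):
    """Mean Error, assuming the convention error = actual - forecast."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal length")
    return sum(a - f for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical toy data, not taken from the paper.
train = [100, 110, 105, 120]
actual = [125, 130]
forecast = [120, 128]

print(mase(actual, forecast, train))
print(mean_error(actual, forecast))
```

Under this sign convention, a negative ME such as the one reported would mean the forecasts were too high on average. The abstract does not state which convention the authors used.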

Published in Heliyon

ISSN: 2405-8440 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Science: Science (General); Social Sciences: Social sciences (General)
Website: https://www.cell.com/heliyon/home
