AIP Advances (Jan 2025)

Comprehensive approach to predictive analysis and anomaly detection for road crash fatalities

  • Chopparapu Gowthami,
  • S. Kavitha

DOI
https://doi.org/10.1063/5.0251493
Journal volume & issue
Vol. 15, no. 1
pp. 015022 – 015022-11

Abstract

Traffic accidents are a major global cause of injury and death, so it is essential to understand and reduce their effects. This research analyzes variables that affect the number of fatalities in traffic crashes, including weather, road features, and geographic location, in order to find high-risk areas and design focused interventions that increase road safety. Predictive models and anomaly detection techniques support proactive steps to avert collisions and lower the number of fatalities, contributing to the broader objective of safer transportation networks. With the main objective of improving road safety, a thorough approach was implemented to evaluate traffic crash data, forecast deaths, and identify anomalies. Using a multimodal method, the research first combines two datasets, crash data and traffic count data, by matching them on geographic coordinates. This integration gives a more comprehensive view of the factors that contribute to traffic accidents. After data preprocessing, which includes feature selection and encoding, a Random Forest Regression model is trained to estimate the number of deaths arising from traffic crashes. The model’s accuracy and predictive power are assessed with the Mean Squared Error. Feature importance analysis is also carried out to determine the variables that most influence traffic crashes. An Isolation Forest model is then used to find anomalies or outliers in the data, so that preventative action can be taken to reduce the impact of accidents. By highlighting regions with increased risk or problems with data quality, this part of the research improves the understanding of unexpected events in accident data.
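The workflow described above (merge on location, encode features, fit a Random Forest regressor, score it with Mean Squared Error, rank feature importances, then flag outliers with an Isolation Forest) can be sketched with scikit-learn. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, and the column names (`cell_x`, `cell_y`, `traffic_count`, `speed_limit`), the grid-cell matching of coordinates, and all hyperparameters are hypothetical choices that the abstract does not specify.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic crash records; every column name here is an assumption.
crashes = pd.DataFrame({
    "lat": rng.uniform(40.0, 40.1, n),
    "lon": rng.uniform(-75.1, -75.0, n),
    "weather": rng.choice(["clear", "rain", "snow"], n),
    "speed_limit": rng.choice([30, 50, 70], n),
})
# Bin coordinates into integer grid cells so the two tables share a join key.
crashes["cell_x"] = np.clip(np.floor((crashes["lat"] - 40.0) / 0.01), 0, 9).astype(int)
crashes["cell_y"] = np.clip(np.floor((crashes["lon"] + 75.1) / 0.01), 0, 9).astype(int)
# Synthetic target: fatalities rise with speed limit and snow.
rate = 0.2 + 0.01 * crashes["speed_limit"] + 0.5 * (crashes["weather"] == "snow")
crashes["fatalities"] = rng.poisson(rate.to_numpy())

# Synthetic traffic counts, one row per grid cell that contains a crash.
traffic = crashes[["cell_x", "cell_y"]].drop_duplicates().reset_index(drop=True)
traffic["traffic_count"] = rng.integers(500, 20000, len(traffic))

# Step 1: combine the two datasets on the shared location key.
data = crashes.merge(traffic, on=["cell_x", "cell_y"], how="inner")

# Step 2: feature selection and one-hot encoding of the categorical column.
features = data[["lat", "lon", "weather", "speed_limit", "traffic_count"]]
X = pd.get_dummies(features, columns=["weather"], dtype=float)
y = data["fatalities"]

# Step 3: train the regressor and score it with Mean Squared Error.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
mse = mean_squared_error(y_test, model.predict(X_test))

# Step 4: rank features by impurity-based importance.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)

# Step 5: flag unusual records; fit_predict returns -1 for outliers.
detector = IsolationForest(contamination=0.05, random_state=0)
data["is_anomaly"] = detector.fit_predict(X) == -1

print(f"MSE: {mse:.3f}")
print(importances.head())
print("flagged records:", int(data["is_anomaly"].sum()))
```

The `contamination` value fixes the share of records flagged as anomalous, so in practice it would need to be chosen from domain knowledge rather than taken from this sketch.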
For comparison, Auto Regressive Integrated Moving Average and Support Vector Regression models are used alongside the Random Forest Regression model; they provide different viewpoints on predicting mortality from traffic accidents. The root mean squared error is used to assess these models’ performance and applicability in real-world scenarios. The study’s findings highlight the significance of data-driven strategies for addressing road safety problems. By applying machine-learning algorithms to integrated datasets, the research offers practical insights to policymakers, transportation authorities, and safety advocates. The predictive models can serve as a proactive tool for identifying high-risk regions and allocating resources for targeted improvements, which can help decrease road crash fatalities and establish safer transportation systems. The research also emphasizes the need for interdisciplinary partnerships and data-driven decision making to improve road safety outcomes. The findings open the door to evidence-based initiatives that use data analytics and predictive modeling to lessen the effects of traffic accidents and save lives.
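A model comparison on root mean squared error, as described above, might look like the following minimal sketch. It covers only the Random Forest and Support Vector Regression models; ARIMA is omitted because it is a time-series model that needs a temporally ordered target and a separate library. The synthetic data, the feature scaling step, and the hyperparameters are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Synthetic regression problem standing in for the crash features and target.
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.3, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# SVR is sensitive to feature scale, so it is wrapped with a scaler here.
models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=1),
    "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
}

# Fit each model on the same split and report RMSE = sqrt(MSE).
rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

for name, value in rmse.items():
    print(f"{name}: RMSE = {value:.3f}")
```

Evaluating every model on the same held-out split keeps the RMSE values directly comparable, since RMSE is expressed in the units of the target.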