Intelligent Systems with Applications (Nov 2022)
Identification and classification of road traffic incidents in Panama City through the analysis of a social media stream and machine learning
Abstract
In Panama City, Panama, as in many cities, the large number of cars on the roads and random traffic events produce constant and extensive traffic jams. Building more traffic lanes usually does not solve these issues. This work proposes the development of a system that allows the visualization of information published on social media about traffic incidents. Feature engineering methods, such as Count Vectors and TF-IDF, were applied to process the tweets into structured data. Machine learning models were created for the classification of traffic-related tweets using SVM, Naïve Bayes, Random Forest and XGBoost. Two prediction models resulted: a classification model that separates incident from non-incident tweets, and a categorization model that determines the type of incident (accident, danger or obstacle). Results show that approximately 200,000 tweets reported traffic incidents from 2014 to 2022. The classification model achieved a precision of over 92%, and the categorization model over 97%. The best results were obtained with the use of Count Vectors and a Random Forest model. Finally, a graphical interface was developed to show the collected data and a stream of live tweets, deployed at the website http://www.traficoya-pty.com/. This system has advantages such as speeding up the detection and visualization of traffic incidents, which can be of great help to the country’s traffic authorities and the general public.
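As a rough illustration of the pipeline described in the abstract, the following minimal sketch turns short texts into Count Vectors, reweights them with TF-IDF, and trains a multinomial Naïve Bayes classifier to separate incident from non-incident tweets. It is not the authors' implementation: the function names, the toy tweets and their labels are invented for illustration, the TF-IDF formula is one common smoothed variant, and Naïve Bayes is used only because it fits in a few lines of standard-library Python (the abstract reports its best results with Random Forest).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (Spanish letters included)."""
    return re.findall(r"[a-záéíóúñü]+", text.lower())


def build_vocab(docs):
    """Sorted list of every distinct token in the training documents."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def count_vector(text, vocab):
    """Count Vector: how often each vocabulary term occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocab]


def tfidf(rows):
    """Reweight count vectors by a smoothed inverse document frequency."""
    n_docs = len(rows)
    doc_freq = [sum(1 for value in col if value > 0) for col in zip(*rows)]
    idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
    return [[count * weight for count, weight in zip(row, idf)] for row in rows]


def train_nb(rows, labels, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing (alpha)."""
    n_terms = len(rows[0])
    model = {}
    for cls in sorted(set(labels)):
        class_rows = [r for r, y in zip(rows, labels) if y == cls]
        totals = [sum(col) for col in zip(*class_rows)]
        denom = sum(totals) + alpha * n_terms
        log_prior = math.log(len(class_rows) / len(rows))
        log_likelihood = [math.log((t + alpha) / denom) for t in totals]
        model[cls] = (log_prior, log_likelihood)
    return model


def predict_nb(model, row):
    """Return the class with the highest log posterior for one count vector."""
    def score(cls):
        log_prior, log_likelihood = model[cls]
        return log_prior + sum(x * w for x, w in zip(row, log_likelihood))
    return max(model, key=score)


# Invented toy tweets and labels, for illustration only (not from the study).
tweets = [
    "accidente en la avenida",
    "choque de dos autos en el puente",
    "arbol caido obstruye la via",
    "buen dia a todos",
    "que buen partido de futbol",
    "feliz dia de las madres",
]
labels = ["incident"] * 3 + ["non_incident"] * 3

vocab = build_vocab(tweets)
rows = [count_vector(t, vocab) for t in tweets]
tfidf_rows = tfidf(rows)  # alternative feature matrix, shown for comparison
model = train_nb(rows, labels)
prediction = predict_nb(model, count_vector("accidente en el puente", vocab))
print(prediction)
```

The same two-stage idea would extend to the categorization model by training a second classifier on incident tweets only, with accident, danger and obstacle as labels.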