IATSS Research (Oct 2022)
Road traffic conditions in Kenya: Exploring the policies and traffic cultures from unstructured user-generated data using NLP
Abstract
Road traffic accidents (RTA) are a prevalent cause of fatality, with African countries having the highest fatality rates (25–34 per 100,000 population). The World Health Organization estimates Kenya's RTA fatality rate at 28 per 100,000 population. According to the literature, fatalities and injuries in the country have increased by 26% and 46.5%, respectively, since 2015. The country also suffers from incomplete RTA data capture, which hinders effective planning and policy adjustments to curb the menace. In this paper, we scraped user-generated data (Twitter) and reports from the National Transport and Safety Authority (NTSA) to shed light on traffic safety, practices, and cultures in the country. To this end, we gathered 1,000,000 tweets and 8000 speeding entries from 2015 to 2021 and performed natural language processing (NLP) and a quantitative study of the data. We applied NLP and an n-gram keyword search to categorize the data into 8 topics: traffic, public service vehicles (PSVs), policing, accident, infrastructure, recklessness, robbery, and corruption. In the data, policing, which covers all police and law-enforcement-related activity, was found to be highly correlated with PSVs, recklessness, accidents, traffic congestion, robbery, infrastructure, and corruption, with correlation coefficients of r(76) = 0.92, 0.91, 0.87, 0.82, 0.81, 0.76, and 0.70, respectively, all with p < 0.001. Topic modeling confirmed that the identified topics are the latent discussion issues affecting the public. From the study, PSVs, policing, and traffic flow were isolated as key issues that ought to be addressed immediately. The study recommends the integration of driver monitoring systems to strengthen policing. The research, which relied on unstructured data, points to the utility of data mining, which would greatly benefit traffic research, particularly African-based studies that suffer from data inadequacy.
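The n-gram keyword categorization and the topic-to-topic correlation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the keyword lists in `TOPIC_KEYWORDS`, the helper names, and the choice of a monthly time bucket are all hypothetical assumptions made for illustration, and only three of the eight topics are shown. The reported r(76) implies 78 paired observations (degrees of freedom n − 2), but the abstract does not state how those observations were formed.

```python
import math
import re
from collections import Counter

# Hypothetical keyword lists (unigrams and bigrams); the paper's actual
# lexicon is not given in the abstract.
TOPIC_KEYWORDS = {
    "policing": ["police", "traffic cop", "road block"],
    "psv": ["matatu", "psv", "bus driver"],
    "accident": ["accident", "crash"],
}

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def topics_for(text, max_n=2):
    """Return the set of topics whose keywords appear as n-grams in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    grams = set()
    for n in range(1, max_n + 1):
        grams.update(ngrams(tokens, n))
    return {
        topic
        for topic, keywords in TOPIC_KEYWORDS.items()
        if any(keyword in grams for keyword in keywords)
    }

def bucket_counts(records):
    """Count topic mentions per time bucket.

    records: iterable of (bucket, text) pairs, e.g. ("2020-01", "tweet text").
    Returns {topic: Counter mapping bucket -> number of matching texts}.
    """
    counts = {topic: Counter() for topic in TOPIC_KEYWORDS}
    for bucket, text in records:
        for topic in topics_for(text):
            counts[topic][bucket] += 1
    return counts

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sd_x * sd_y)

if __name__ == "__main__":
    # Made-up example texts, not data from the study.
    sample = [
        ("2020-01", "Police stopped a matatu at the road block"),
        ("2020-01", "Bad crash on the highway"),
        ("2020-02", "Matatu crew arguing with police again"),
        ("2020-02", "Traffic cop waved the PSV through"),
        ("2020-03", "Quiet roads today"),
    ]
    counts = bucket_counts(sample)
    buckets = sorted({bucket for bucket, _ in sample})
    policing = [counts["policing"][b] for b in buckets]
    psv = [counts["psv"][b] for b in buckets]
    print(pearson_r(policing, psv))
```

Matching on whole n-grams rather than substrings keeps a keyword such as "psv" from firing inside unrelated words, at the cost of missing inflected forms; a real lexicon would likely need stemming or a broader keyword list.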