IEEE Access (Jan 2019)

Face off: Travel Habits, Road Conditions and Traffic City Characteristics Bared Using Twitter

  • Amit Agarwal,
  • Durga Toshniwal

DOI
https://doi.org/10.1109/ACCESS.2019.2917159
Journal volume & issue
Vol. 7
pp. 66536 – 66552

Abstract

Traditional detection of transport-related issues is often limited by sparse physical sensor coverage, and reporting incidents to the emergency response system is labor intensive. In this work, tweet text is mined to identify complaints about road transportation issues, namely traffic, accidents, and potholes. Keyword-based approaches have previously been used to identify and segregate tweets by issue, but they depend solely on manually supplied seed keywords, which are not sufficient to cover all relevant tweets. To overcome this limitation, a novel approach is proposed that captures semantic context through dense word embeddings learned with the word2vec model. However, segregating tweets on the basis of semantically similar keywords may suffer from pragmatic ambiguity. To handle this, the word2vec model is applied to match semantically similar tweets to each category. Furthermore, hotspots are identified for each category. Because geo-tagged tweets are scarce, we propose a hybrid method that combines Named Entity Recognition (NER), Part-of-Speech (POS) tagging, and Regular Expressions (RE) to extract location information from the tweet text. Because no ground-truth dataset is available, the feasibility of the model is validated against existing records (i.e., those published by official government accounts and reported in news media). The evaluation results indicate that the approach identifies a few additional hotspots beyond those in the existing reports.
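
The idea of expanding manually chosen seed keywords with semantically similar terms can be illustrated with a minimal sketch. It is not the authors' implementation. The vectors in `TOY_VECTORS` are invented for illustration only, where a real pipeline would use embeddings trained with word2vec on a tweet corpus. The function names `cosine` and `expand_seeds` and the `threshold` value are likewise hypothetical.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# A real system would load embeddings trained with word2vec on tweets.
TOY_VECTORS = {
    "traffic":    [0.90, 0.10, 0.00],
    "jam":        [0.80, 0.20, 0.10],
    "congestion": [0.85, 0.15, 0.05],
    "pothole":    [0.10, 0.90, 0.10],
    "crater":     [0.20, 0.80, 0.20],
    "accident":   [0.10, 0.10, 0.90],
    "collision":  [0.15, 0.20, 0.85],
    "coffee":     [0.40, 0.40, 0.40],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_seeds(seeds, vectors, threshold=0.95):
    """Return the seeds plus every vocabulary word whose vector is at least
    `threshold`-similar to some seed (threshold is an assumed value)."""
    known_seeds = [s for s in seeds if s in vectors]
    expanded = set(seeds)
    for word, vec in vectors.items():
        if any(cosine(vec, vectors[s]) >= threshold for s in known_seeds):
            expanded.add(word)
    return expanded


if __name__ == "__main__":
    for category in ("traffic", "pothole", "accident"):
        print(category, sorted(expand_seeds([category], TOY_VECTORS)))
```

With these toy vectors, the seed "traffic" picks up "jam" and "congestion", while an unrelated word such as "coffee" stays below the threshold. A tweet could then be assigned to a category when it contains any term from that category's expanded set.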

Keywords