IEEE Access (Jan 2020)

Word-Embedding-Based Traffic Document Classification Model for Detecting Emerging Risks Using Sentiment Similarity Weight

  • Min-Jeong Kim,
  • Ji-Soo Kang,
  • Kyungyong Chung

DOI
https://doi.org/10.1109/ACCESS.2020.3026585
Journal volume & issue
Vol. 8
pp. 183983 – 183994

Abstract

With the increase in traffic accident rates, traffic risk detection is becoming increasingly important. Moreover, it is necessary to provide appropriate traffic information considering user locations and routes and to design an analysis method accordingly. This paper proposes a word-embedding-based traffic document classification model for detecting emerging risks using a quantity termed sentiment similarity weight (SSW). The proposed method detects emerging risks by considering and classifying the importance and polarity of keywords in traffic documents. Conventional sentiment analysis methods fail to utilize semantically significant keywords unless they are included in a sentiment dictionary. The proposed method overcomes this disadvantage of sentiment dictionaries through word imputation using an established similarity dictionary, which widens the limited utilization range. The proposed method is evaluated through three tests. In the first test, the similarity between keywords is measured to evaluate model accuracy. In the second test, three classifiers for emerging risk classification are compared. In the last test, emerging risk detection is assessed with and without the proposed SSW to verify its effectiveness. The evaluation results demonstrate that the proposed traffic-related document classification model using the SSW has an F-measure of 0.907, indicating satisfactory performance. Therefore, the proposed SSW can be effectively used as a parameter in traffic-related document classification and enables the detection of emerging risks.
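
The word-imputation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not define the SSW, so the code below is not the authors' formula: it assumes that a keyword missing from the sentiment dictionary borrows the polarity of its most similar dictionary word, scaled by cosine similarity between word embeddings. The function names, the `min_sim` threshold, and the toy vectors are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def imputed_polarity(word, embeddings, lexicon, min_sim=0.5):
    """Return (polarity, weight) for a keyword.

    Assumption: a word found in the sentiment lexicon keeps its polarity
    with weight 1.0; any other word borrows the polarity of its nearest
    lexicon word, weighted by their embedding similarity.
    """
    if word in lexicon:
        return lexicon[word], 1.0
    if word not in embeddings:
        return 0.0, 0.0  # no embedding, so nothing can be imputed
    best_word, best_sim = None, min_sim
    for cand in lexicon:
        if cand in embeddings:
            sim = cosine(embeddings[word], embeddings[cand])
            if sim >= best_sim:
                best_word, best_sim = cand, sim
    if best_word is None:
        return 0.0, 0.0  # no lexicon word is similar enough
    return lexicon[best_word], best_sim

def document_score(tokens, embeddings, lexicon):
    """Sum of similarity-weighted polarities over a document's keywords."""
    total = 0.0
    for token in tokens:
        polarity, weight = imputed_polarity(token, embeddings, lexicon)
        total += polarity * weight
    return total

# Toy data for illustration only (not taken from the paper).
embeddings = {
    "crash": [1.0, 0.0],
    "collision": [0.9, 0.1],
    "smooth": [0.0, 1.0],
}
lexicon = {"crash": -1.0, "smooth": 1.0}

print(imputed_polarity("collision", embeddings, lexicon))
print(document_score(["collision", "crash"], embeddings, lexicon))
```

In this toy setup, "collision" is absent from the lexicon but lies close to "crash" in the embedding space, so it contributes a negative polarity with a weight just under 1. A score like this could then serve as one input feature for a document classifier, alongside keyword importance.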

Keywords