Applied Sciences (Nov 2024)
Deep Learning Classification of Traffic-Related Tweets: An Advanced Framework Using Deep Learning for Contextual Understanding and Traffic-Related Short Text Classification
Abstract
Classifying social media (SM) messages into relevant or irrelevant categories is challenging due to data sparsity, imbalance, and ambiguity. This study aims to improve Intelligent Transport Systems (ITS) by enhancing short text classification of traffic-related SM data. Deep learning methods such as RNNs, CNNs, and BERT are effective at capturing context, but they can be computationally expensive, struggle with very short texts, and perform poorly with rare words. On the other hand, transfer learning leverages pre-trained knowledge but may be biased towards the pre-training domain. To address these challenges, we propose DLCTC, a novel system combining character-level, word-level, and context features with BiLSTM and TextCNN-based attention. By utilizing external knowledge, DLCTC ensures an accurate understanding of concepts and abbreviations in traffic-related short texts. BiLSTM captures context and term correlations; TextCNN captures local patterns. Multi-level attention focuses on important features across character, word, and concept levels. Experimental studies demonstrate DLCTC’s effectiveness over well-known short-text classification approaches based on CNN, RNN, and BERT.
Keywords