Sistemas de Informação (Dec 2023)

Sub-language Sentiment Analysis in WhatsApp Domain with Deep Learning Approaches

  • Morais, L. P.,
  • Soares, A. S.,
  • Borges, V. C. M.,
  • Silva, N. F. F.,
  • Pereira, F. S. F.

Journal volume & issue
Vol. 1, no. 31
pp. 32 – 47

Abstract

Sentiment analysis approaches have offered a useful tool for decision support systems in various fields, including politics, network management, marketing, and healthcare. Owing to the increasing impact of online social networks on these fields and the fact that they are rich sources of information, sentiment analysis techniques for this scenario have evolved successfully. WhatsApp is a social network platform that enables users to interact with close ties in a particular manner, communicating more meaningful, genuine, tangible, and personal information, such as a sentiment, to the recipient. Hence, the WhatsApp domain can be defined as a sub-language of the first language. However, only a few studies have focused on WhatsApp sentiment analysis. These works usually employ outdated sentiment lexicon techniques and do not assess the most modern techniques based on deep learning. This study aims to evaluate techniques for sentiment analysis based on deep neural networks and transfer learning, considering the intrinsic features of the sub-language in the WhatsApp domain. BERT and ALBERT (transfer learning approaches) achieve the best performance in accuracy and F1 (88% on average for both metrics and classifiers), similar to their performance in other domains (Twitter). Although DCNN and LSTM with static embeddings usually achieve good performance when they are pre-trained on a larger corpus from other domains, these approaches show the worst performance in the WhatsApp domain. Furthermore, ELMo provides a good trade-off between accuracy and training time complexity, mainly when taking into account the small size of our WhatsApp training corpus. Hence, it can be inferred that the specific characteristics of the WhatsApp sub-language have an impact on the performance of some traditional sentiment analysis classifiers.

Keywords