Alexandria Engineering Journal (Jan 2025)

Topic-aware neural attention network for malicious social media spam detection

  • Maged Nasser,
  • Faisal Saeed,
  • Aminu Da’u,
  • Abdulaziz Alblwi,
  • Mohammed Al-Sarem

Journal volume & issue
Vol. 111
pp. 540 – 554

Abstract

Social media platforms, such as Facebook and X (formerly known as Twitter), have become indispensable tools in today's society because they facilitate social discussion and information sharing. This openness also makes social networks attractive to spammers, who intentionally spread fake messages, post malicious links, and circulate rumours. Recently, several machine learning methods have been introduced for classifying malicious spam on social networks. However, most existing methods rely on handcrafted features and traditional embedding models, which are relatively less effective. Therefore, inspired by the success of neural attention networks, we propose an interactive neural attention-based method for malicious spam detection that integrates long short-term memory (LSTM), topic modelling, and the BERT technique. In the proposed approach, we first employ an LSTM encoder, integrated with the Twitter latent Dirichlet allocation (LDA) model via an interactive attention mechanism, to jointly learn local content and global topic representations. Second, to further learn the contextualized features of texts, the model is integrated with the BERT technique. Last, the Softmax function is applied at the output layer for the final spam classification. A series of experiments was conducted on two real-world datasets to evaluate the model. On dataset 1, the proposed model outperformed the baseline techniques, with average improvements in recall, precision, F1, and accuracy of 17.54 %, 6.19 %, 11.91 %, and 12.27 %, respectively. On the second dataset, the proposed model also performed well, obtaining average gains of 11.81 %, 4.38 %, 8.12 %, and 7.42 % in recall, precision, F1, and accuracy, respectively.
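
The pipeline described in the abstract (content states and topic representations combined through interactive attention, joined with a contextual vector, then passed to a Softmax output) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not specify the attention scoring function, so dot-product scoring with mean-pooled queries is an assumption here, and the names `attend`, `mean_pool`, and `classify`, the weight matrix `W`, and the bias `b` are hypothetical. The inputs stand in for LSTM hidden states, Twitter-LDA topic embeddings, and a BERT sentence vector, none of which are computed in this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_pool(states):
    """Average a list of equal-length vectors into one vector."""
    dim = len(states[0])
    return [sum(s[i] for s in states) / len(states) for i in range(dim)]


def attend(states, query):
    """Pool states with weights from dot-product similarity to the query.

    Assumption: dot-product scoring; the abstract does not name the scorer.
    """
    weights = softmax([dot(h, query) for h in states])
    dim = len(states[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return pooled, weights


def classify(content_states, topic_states, context_vec, W, b):
    """Interactive attention, feature concatenation, then softmax.

    content_states: stand-in for LSTM hidden states (one vector per token)
    topic_states:   stand-in for topic embeddings from a topic model
    context_vec:    stand-in for a contextual sentence vector (e.g. from BERT)
    W, b:           hypothetical output-layer weights, one row per class
    """
    # Each side is summarized, then used as the query over the other side.
    content_query = mean_pool(content_states)
    topic_query = mean_pool(topic_states)
    content_repr, content_weights = attend(content_states, topic_query)
    topic_repr, _ = attend(topic_states, content_query)

    # Concatenate the three views of the message into one feature vector.
    features = content_repr + topic_repr + context_vec
    logits = [dot(row, features) + bias for row, bias in zip(W, b)]
    return softmax(logits), content_weights
```

In this sketch, `classify` returns class probabilities (for example spam versus non-spam) together with the attention weights over the content states, which show which tokens align most closely with the topic summary. A trained model would learn `W`, `b`, and the encoders themselves; here they are fixed inputs chosen only for illustration.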

Keywords