Topic-aware neural attention network for malicious social media spam detection

Maged Nasser; Faisal Saeed; Aminu Da’u; Abdulaziz Alblwi; Mohammed Al-Sarem

Alexandria Engineering Journal (Jan 2025)

Affiliations

Maged Nasser: Computer & Information Sciences Department, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia
Faisal Saeed: College of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK; Corresponding author.
Aminu Da’u: Department of Computer Science, Hassan Usman Katsina Polytechnic, Katsina State, Nigeria
Abdulaziz Alblwi: Department of Computer Science, Applied College, Taibah University, Saudi Arabia
Mohammed Al-Sarem: College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia

Journal volume & issue: Vol. 111, pp. 540–554

Abstract

Social media platforms, such as Facebook and X (formerly known as Twitter), have become indispensable tools in today's society because they facilitate social discussion and information sharing. This makes social networks attractive to spammers, who intentionally spread fake messages, post malicious links and spread rumours. Recently, several machine learning methods have been introduced for malicious spam classification on social networks. However, most existing methods rely on handcrafted features and traditional embedding models, which are relatively less effective. Therefore, inspired by the success of neural attention networks, we propose an interactive neural attention-based method for malicious spam detection that integrates long short-term memory (LSTM), topic modelling, and the BERT technique. In the proposed approach, first, we employed an LSTM encoder, which was integrated with the Twitter latent Dirichlet allocation (LDA) model via an interactive attention mechanism to jointly learn local content and global topic representations. Second, to further learn the contextualized features of texts, the model was integrated with the BERT technique. Last, the Softmax function was applied at the output layer for the final spam classification. A series of experiments was conducted on two real-world datasets to evaluate the model. On dataset 1, the proposed model outperformed the baseline techniques, with average improvements in recall, precision, F1, and accuracy of 17.54 %, 6.19 %, 11.91 %, and 12.27 %, respectively. In addition, the proposed model performed well on the second dataset, obtaining average gains of 11.81 %, 4.38 %, 8.12 %, and 7.42 % in recall, precision, F1, and accuracy, respectively.
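
The abstract gives no equations, so the following is only a minimal sketch of how the three stages could fit together, not the authors' implementation. It assumes a pooled-query form of interactive attention: content states are weighted by their similarity to the averaged topic vectors, and topic vectors by their similarity to the averaged content states. The toy vectors stand in for LSTM hidden states, Twitter-LDA topic embeddings, and a BERT sentence vector; every name, dimension, and weight below is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Combine equal-length vectors using one scalar weight per vector."""
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


def mean_pool(vectors):
    n = len(vectors)
    return weighted_sum([1.0 / n] * n, vectors)


def interactive_attention(token_states, topic_vectors):
    """Assumed pooled-query interaction: each side is attended using the
    average of the other side as the query."""
    topic_query = mean_pool(topic_vectors)
    content_query = mean_pool(token_states)
    content_weights = softmax([dot(h, topic_query) for h in token_states])
    topic_weights = softmax([dot(t, content_query) for t in topic_vectors])
    content_repr = weighted_sum(content_weights, token_states)
    topic_repr = weighted_sum(topic_weights, topic_vectors)
    return content_repr, topic_repr, content_weights, topic_weights


def classify(features, weight_rows, biases):
    """Linear output layer followed by softmax over the classes."""
    logits = [dot(row, features) + b for row, b in zip(weight_rows, biases)]
    return softmax(logits)


# Hypothetical toy inputs (2-dimensional for readability).
token_states = [[0.2, 0.1], [0.9, 0.4], [0.1, 0.8]]  # stand-in for LSTM outputs
topic_vectors = [[1.0, 0.0], [0.0, 1.0]]             # stand-in for LDA topic embeddings
bert_vector = [0.5, -0.3]                            # stand-in for a BERT sentence vector

content_repr, topic_repr, content_w, topic_w = interactive_attention(
    token_states, topic_vectors
)

# Concatenate local content, global topic, and contextual features.
features = content_repr + topic_repr + bert_vector

# Hypothetical, untrained output weights: one row per class (spam, not spam).
weight_rows = [
    [0.3, -0.2, 0.1, 0.4, -0.5, 0.2],
    [-0.1, 0.6, 0.2, -0.3, 0.4, 0.1],
]
biases = [0.0, 0.0]

probs = classify(features, weight_rows, biases)
print("class probabilities:", probs)
```

In a trained model the attention scores and output weights would be learned parameters rather than fixed numbers; the sketch only shows the data flow from content and topic representations, through attention, to a two-class probability.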

Published in Alexandria Engineering Journal

ISSN: 1110-0168 (Print); 2090-2670 (Online)
Publisher: Elsevier
Country of publisher: Egypt
LCC subjects: Technology: Engineering (General). Civil engineering (General)
Website: http://www.journals.elsevier.com/alexandria-engineering-journal/
