IEEE Access (Jan 2023)

A URL-Based Social Semantic Attacks Detection With Character-Aware Language Model

  • May Almousa,
  • Mohd Anwar

DOI
https://doi.org/10.1109/ACCESS.2023.3241121
Journal volume & issue
Vol. 11
pp. 10654 – 10663

Abstract

Social engineering attacks rely on human errors and behavioral choices. The semantic attack, a subcategory of social engineering attacks, utilizes behavioral or cosmetic deception vectors (e.g., an attacker creates a malicious website that looks and behaves like the legitimate one). The most common types of social semantic attacks include phishing, spamming, defacement, and malware attacks. We investigate the feasibility of developing URL-based social semantic attack detection models utilizing character-aware language models. Specifically, we developed three types of models: a long short-term memory (LSTM)-based detection model, a convolutional neural network (CNN)-based detection model, and a CharacterBERT-based detection model. We benchmarked the performance of these models across the different attacks. Using the CharacterBERT-based detection model, the overall evaluation recorded a high detection accuracy of 99.65%, averaged over a 5-fold cross-validation. Considering per-class performance, the CharacterBERT model ranked best among our three models in detecting social semantic attacks, reaching its best accuracy of 99.90% in detecting defacement attacks.
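
As a rough illustration of two ideas mentioned in the abstract, the sketch below shows how a URL might be turned into a character-level input sequence and how per-fold accuracies from a 5-fold cross-validation could be averaged. It is not the authors' implementation: the vocabulary, the `max_len` value, and the helper names `encode_url`, `kfold_indices`, and `mean_accuracy` are all assumptions made for this example, and no detection model is trained here.

```python
import string

# Assumed character vocabulary (printable ASCII); not taken from the paper.
# Id 0 is reserved for padding and id 1 for characters outside the vocabulary.
PAD, UNK = 0, 1
VOCAB = {ch: i + 2 for i, ch in enumerate(string.printable)}


def encode_url(url, max_len=200):
    """Map a URL to a fixed-length list of character ids (truncate, then pad)."""
    ids = [VOCAB.get(ch, UNK) for ch in url[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def kfold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def mean_accuracy(fold_accuracies):
    """Average the accuracy scores obtained on the individual folds."""
    return sum(fold_accuracies) / len(fold_accuracies)
```

In practice the sequences returned by `encode_url` would be fed to a character-level model such as an LSTM or CNN, each fold from `kfold_indices` would be used once as the held-out test set, and `mean_accuracy` would give the single reported figure.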

Keywords