Alexandria Engineering Journal (Apr 2023)

Deep learning-based spam image filtering

  • Wessam M. Salama
  • Moustafa H. Aly
  • Yasmine Abouelseoud

Journal volume & issue
Vol. 68
pp. 461 – 468

Abstract

Spam is unwanted material, and it may be delivered in the form of images. While many machine learning approaches are effective at detecting textual spam, the same is not true for image spam. In this paper, a new framework for identifying image spam is proposed. Images are divided into two categories: an image that carries undesirable material is referred to as a spam image, whereas any other image is referred to as a ham image. Our proposed methodology applies different pre-trained deep learning models, including InceptionV3, Densely Connected Convolutional Networks (DenseNet121), Residual Networks (ResNet50), Visual Geometry Group (VGG16) and MobileNetV2, to filter out unwanted spam images. Standard test datasets such as the Dredze dataset, the Image Spam Hunter (ISH) dataset and the Improved dataset are used for performance testing. Furthermore, transfer learning and data augmentation are employed to address the shortage of labeled data. In our implementation, the fully connected (FC) layer in each of the aforementioned pre-trained models is replaced with a Support Vector Machine (SVM) classifier, resulting in improved accuracy. The results show that the ResNet50 model yields the best performance, achieving 99.87% accuracy, 99.88% area under the curve (AUC), 99.98% sensitivity, 99.79% precision, a 98.99% F1 score and a testing time on the order of one to two seconds for the ISH dataset.
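
The idea of replacing the FC layer of a pre-trained network with an SVM can be illustrated with a minimal sketch. It is not the authors' implementation: the use of Keras and scikit-learn, the 224 x 224 input size, global average pooling, feature standardization, the RBF kernel, and the function names extract_features, fit_svm_head and binary_metrics are all assumptions made for illustration. The last helper shows how accuracy, sensitivity, precision and F1 are conventionally derived from confusion counts, with spam as the positive class.

```python
def extract_features(images):
    """Map RGB images of shape (n, 224, 224, 3), values 0-255, to ResNet50 features."""
    # Imported lazily so the metric helper below works without TensorFlow installed.
    from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

    # include_top=False drops the ImageNet fully connected layer;
    # pooling="avg" reduces the last feature map to one 2048-d vector per image.
    backbone = ResNet50(weights="imagenet", include_top=False, pooling="avg")
    return backbone.predict(preprocess_input(images.astype("float32")), verbose=0)


def fit_svm_head(train_features, train_labels):
    """Train an SVM on frozen CNN features (kernel and scaling are assumptions)."""
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    head = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    return head.fit(train_features, train_labels)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall), precision and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
    }
```

Under these assumptions, a run would call extract_features on the training and test images, pass the training features and labels (1 for spam, 0 for ham) to fit_svm_head, and score the resulting predictions with binary_metrics. The backbone weights stay frozen, so only the SVM is fitted to the spam data, which is one way transfer learning can cope with a small labeled set.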

Keywords