Deep learning-based spam image filtering

Wessam M. Salama; Moustafa H. Aly; Yasmine Abouelseoud

Alexandria Engineering Journal (Apr 2023)

Affiliations

Wessam M. Salama: Department of Basic Sciences, Faculty of Engineering, Pharos University, Alexandria, Egypt
Moustafa H. Aly: Department of Electronics and Communications Engineering, College of Engineering Technology, Arab Academy for Science, Technology and Maritime Transport, Alexandria, Egypt; Corresponding author.
Yasmine Abouelseoud: Department of Engineering Mathematics and Physics, Faculty of Engineering, Alexandria University, Egypt

Journal volume & issue: Vol. 68, pp. 461–468

Abstract

Spam is unwanted material, and it may be delivered in the form of images. While many machine learning approaches are effective at detecting textual spam, this is not true for image spam. In this paper, a new framework for identifying image spam is proposed. Images are divided into two categories: an image that carries undesirable material is referred to as a spam image, and any other image is referred to as a ham image. Our proposed methodology applies different pre-trained deep learning models, including InceptionV3, Densely Connected Convolutional Network 121 (DenseNet121), Residual Network (ResNet50), Visual Geometry Group (VGG16) and MobileNetV2, to filter out unwanted spam images. Three standard test datasets, namely the Dredze Dataset, the Image Spam Hunter (ISH) Dataset and the Improved Dataset, are used for performance testing. Furthermore, transfer learning and data augmentation are employed to address the shortage of labeled data. In our implementation, the fully connected (FC) layer in the aforementioned pre-trained models is replaced with a Support Vector Machine (SVM) classifier, resulting in improved accuracy. The obtained results reveal that the ResNet50 model yields the best performance, achieving 99.87% accuracy, 99.88% area under the curve (AUC), 99.98% sensitivity, 99.79% precision, 98.99% F1 score and a testing time on the order of one to two seconds for the ISH dataset.
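
The threshold-based metrics named in the abstract (accuracy, sensitivity, precision and F1 score) all follow from the four confusion-matrix counts of a binary spam/ham classifier. The following is a minimal sketch of those standard definitions, not the authors' evaluation code: the function name binary_metrics and the counts passed to it are hypothetical and chosen only for illustration, and spam is assumed to be the positive class. AUC is omitted because it requires ranked classifier scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp: spam images correctly flagged as spam
    fp: ham images wrongly flagged as spam
    tn: ham images correctly passed as ham
    fn: spam images wrongly passed as ham
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity (recall): share of actual spam that was caught.
    sensitivity = tp / (tp + fn)
    # Precision: share of flagged images that really were spam.
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=90, fp=10, tn=95, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is a harmonic mean, it always lies between precision and sensitivity, which gives a quick sanity check when reading reported results.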

Published in Alexandria Engineering Journal

ISSN: 1110-0168 (Print); 2090-2670 (Online)
Publisher: Elsevier
Country of publisher: Egypt
LCC subjects: Technology: Engineering (General). Civil engineering (General)
Website: http://www.journals.elsevier.com/alexandria-engineering-journal/
