IEEE Access (Jan 2023)

Performance Evaluation of Phishing Classification Techniques on Various Data Sources and Schemes

  • Rahmad Abdillah,
  • Zarina Shukur,
  • Masnizah Mohd,
  • T. S. Mohd Zamri Murah,
  • Insu Oh,
  • Kangbin Yim

DOI
https://doi.org/10.1109/ACCESS.2022.3225971
Journal volume & issue
Vol. 11
pp. 38721 – 38738

Abstract

Phishing attacks have become a serious threat in recent years, prompting numerous studies to determine which classification technique best detects them. Several of these studies compare techniques using only specific datasets, without the most crucial aspect: evaluating how performance responds to changes in the data. Findings about classification techniques cannot be generalized when they rest on specific datasets and techniques alone. This research therefore measured the performance of classification techniques on changing data by applying subset schemes to each dataset. It used unbalanced and balanced phishing datasets with subset schemes in ratios of 90:10, 80:20, 70:30, and 60:40. Thirteen recent classification techniques used in earlier phishing studies were compared and evaluated against ten performance measures. The results showed that the proposed schemes successfully uncover the maximum and minimum performance obtained by a classification technique. These comparisons can provide deeper insights into phishing classification techniques than related research.
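
The evaluation idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic one-feature samples, the classifier is a toy threshold rule standing in for the thirteen techniques, accuracy stands in for the ten performance measures, and the function names are hypothetical. Only the four train:test ratios and the balanced/unbalanced distinction come from the abstract.

```python
import random

# Train:test ratios named in the abstract.
SCHEMES = [(90, 10), (80, 20), (70, 30), (60, 40)]

def make_toy_data(n_legit, n_phish, seed=0):
    """Synthetic one-feature samples; label 1 = phishing (assumed data, not the paper's)."""
    rng = random.Random(seed)
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_legit)]
    data += [(rng.gauss(2.0, 1.0), 1) for _ in range(n_phish)]
    rng.shuffle(data)
    return data

def split(data, train_pct):
    """Use the first train_pct% of the shuffled data for training, the rest for testing."""
    cut = len(data) * train_pct // 100
    return data[:cut], data[cut:]

def fit_threshold(train):
    """Toy classifier (assumption): threshold halfway between the two class means."""
    means = []
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        means.append(sum(values) / len(values))
    return (means[0] + means[1]) / 2

def accuracy(threshold, test):
    """Fraction of test samples where 'x above threshold' matches the phishing label."""
    correct = sum((x > threshold) == bool(y) for x, y in test)
    return correct / len(test)

def evaluate(data):
    """Accuracy for each split scheme, keyed by 'train:test'."""
    results = {}
    for train_pct, test_pct in SCHEMES:
        train, test = split(data, train_pct)
        results[f"{train_pct}:{test_pct}"] = accuracy(fit_threshold(train), test)
    return results

balanced = evaluate(make_toy_data(500, 500))
unbalanced = evaluate(make_toy_data(900, 100))
for name, res in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, "min", min(res.values()), "max", max(res.values()))
```

Reporting the minimum and maximum score across the four schemes, for each dataset, mirrors the abstract's point that a single split can hide how much a technique's performance varies as the data change.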

Keywords