Symmetry (Jan 2023)

A Novel Phishing Website Detection Model Based on LightGBM and Domain Name Features

  • Jingxian Zhou,
  • Haibin Cui,
  • Xina Li,
  • Wenjin Yang,
  • Xi Wu

DOI
https://doi.org/10.3390/sym15010180
Journal volume & issue
Vol. 15, no. 1
p. 180

Abstract

Phishing attacks have grown in both sophistication and number in recent years. Methods for evading phishing detection have developed correspondingly, posing serious challenges to the privacy and security of users of smart systems. This study proposes a machine-learning-based method, built on LightGBM and domain name features, to identify phishing websites and maintain the security of smart systems. The symmetry considered here refers to the domain name features that remain constant across multiple domain-name-generation algorithms. The proposed detection model first extracts features from the domain name of a given website, including character-level features and domain name information. The features are then filtered to improve the model's accuracy and subsequently used for classification. Experimental comparisons showed that the proposed detection model, which integrates both types of features for training, significantly outperforms a model that uses a single type of feature. The proposed method also achieves higher detection accuracy than other methods and is suitable for the real-time detection of large numbers of phishing websites.
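
To make the character-level feature extraction step more concrete, the following is a minimal Python sketch. The abstract does not list the features the authors actually use, so the function name `char_features` and the specific features chosen here (length, label count, digit and hyphen counts, Shannon entropy) are illustrative assumptions, not the paper's feature set. In a full pipeline, feature vectors like these would be combined with domain name information, filtered, and passed to a LightGBM classifier; that stage is not shown.

```python
import math
from collections import Counter


def char_features(domain: str) -> dict:
    """Compute simple character-level features of a domain name.

    Hypothetical feature set for illustration only; the paper's
    actual features are not given in the abstract.
    """
    name = domain.lower().strip(".")
    labels = name.split(".")
    chars = name.replace(".", "")  # characters excluding the dot separators
    total = len(chars)
    counts = Counter(chars)

    # Shannon entropy (bits per character) of the character distribution.
    entropy = 0.0
    if total:
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )

    return {
        "length": len(name),             # full length, dots included
        "num_labels": len(labels),       # dot-separated parts
        "num_digits": sum(ch.isdigit() for ch in chars),
        "num_hyphens": chars.count("-"),
        "entropy": entropy,
    }


if __name__ == "__main__":
    for d in ("example.com", "paypa1-login.example.com"):
        print(d, char_features(d))
```

Features such as digit count and entropy are commonly used heuristics for algorithmically generated or lookalike domain names; whether they match the features selected in the paper would need to be checked against the full text.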

Keywords