Applied Sciences (Jul 2024)

Enhanced Feature Selection Using Genetic Algorithm for Machine-Learning-Based Phishing URL Detection

  • Emre Kocyigit,
  • Mehmet Korkmaz,
  • Ozgur Koray Sahingoz,
  • Banu Diri

DOI
https://doi.org/10.3390/app14146081
Journal volume & issue
Vol. 14, no. 14
p. 6081

Abstract

In recent years, computer security has grown in importance due to the rapid advancement of digital technology, widespread Internet use, and the increasing sophistication of cyberattacks. Machine learning has attracted great interest for securing data systems because it can automatically detect and respond to security threats in real time, which is crucial for maintaining the security of computer systems and protecting data from malicious attacks. This study concentrates on detection systems for phishing attacks, a prevalent cyber-threat. These systems assess the features of incoming requests to identify whether they are malicious. As the number of features in these systems increases, feature selection has become an essential pre-processing phase: it identifies the most important features from the available set in order to prevent overfitting, improve model performance, reduce computational cost, and decrease training and execution time. Leveraging genetic algorithms, which simulate natural selection to search for optimal solutions, we propose a novel, locally optimized feature selection method based on genetic algorithms and apply it to a URL-based phishing detection system with machine learning models. Our research demonstrates that the proposed technique offers a promising strategy for improving the performance of machine learning models.
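
To make the idea of genetic-algorithm feature selection concrete, the following is a minimal sketch and not the authors' method. It assumes each candidate solution is a bit mask over the features, and it uses tournament selection, one-point crossover, bit-flip mutation, and elitism. It omits the local optimization step that the abstract mentions. The names ga_select, toy_fitness, and INFORMATIVE are hypothetical, and the toy fitness function merely stands in for a real score such as the validation accuracy of a classifier trained on the selected URL features.

```python
import random

# Hypothetical ground truth for the toy problem: only these feature
# indices are useful. A real system would not know this in advance.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    """Stand-in for model accuracy: reward useful features, penalize mask size."""
    useful = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return useful - 0.1 * sum(mask)

def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Return the best feature mask found (list of 0/1, one entry per feature)."""
    rng = random.Random(seed)
    # Each individual is a bit mask: mask[i] == 1 keeps feature i.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament():
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best

best_mask = ga_select(n_features=8, fitness=toy_fitness)
selected = [i for i, bit in enumerate(best_mask) if bit]
print("selected feature indices:", selected)
```

In practice, the fitness call would train and evaluate a classifier on the columns selected by the mask, which makes it by far the most expensive part of the loop. Caching the fitness of masks that have already been evaluated is a common way to reduce that cost.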

Keywords