Jurnal Sisfokom (Nov 2024)

Enhancing XGBoost Performance in Malware Detection through Chi-Squared Feature Selection

  • Salma Rosyada,
  • Fauzi Adi Rafrastara,
  • Arsabilla Ramadhani,
  • Wildanil Ghozi,
  • Warusia Yassin

DOI
https://doi.org/10.32736/sisfokom.v13i3.2293
Journal volume & issue
Vol. 13, no. 3
pp. 396 – 402

Abstract

Read online

The increasing prevalence of malware poses significant risks, including data loss and unauthorized access. These threats manifest in various forms, such as viruses, Trojans, worms, and ransomware. Each continually evolves to exploit system vulnerabilities. Ransomware has seen a particularly rapid increase, as evidenced by the devastating WannaCry attack of 2017 which crippled critical infrastructure and caused immense economic damage. Due to their heavy reliance on signature-based techniques, traditional anti-malware solutions struggle to keep pace with malware's evolving nature. However, these techniques face limitations, as even slight code modifications can allow malware to evade detection. Consequently, this highlights weaknesses in current cybersecurity defenses and underscores the need for more sophisticated detection methods. To address these challenges, this study proposes an enhanced malware detection approach utilizing Extreme Gradient Boosting (XGBoost) in conjunction with Chi-Squared Feature Selection. The research applied XGBoost to a malware dataset and implemented preprocessing steps such as class balancing and feature scaling. Furthermore, the incorporation of Chi-Squared Feature Selection improved the model's accuracy from 99.1% to 99.2% and reduced testing time by 89.28%, demonstrating its efficacy and efficiency. These results confirm that prioritizing relevant features enhances both the accuracy and computational speed of the model. Ultimately, combining feature selection with machine learning techniques proves effective in addressing modern malware detection challenges, not only enhancing accuracy but also expediting processing times.

Keywords