Journal of Big Data (Sep 2023)

MAFSIDS: a reinforcement learning-based intrusion detection model for multi-agent feature selection networks

  • Kezhou Ren,
  • Yifan Zeng,
  • Yuanfu Zhong,
  • Biao Sheng,
  • Yingchao Zhang

DOI
https://doi.org/10.1186/s40537-023-00814-4
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 30

Abstract

Read online

Abstract Large unbalanced datasets pose challenges for machine learning models, as redundant and irrelevant features can hinder their effectiveness. Furthermore, the performance of intrusion detection systems (IDS) can be further degraded by the emergence of new network attack types. To address these issues, we propose MAFSIDS (Multi-Agent Feature Selection Intrusion Detection System), a DQL (Deep Q-Learning) based IDS. MAFSIDS comprises a feature self-selection algorithm and a DRL (Deep Reinforcement Learning) attack detection module. The feature self-selection algorithm leverages a multi-agent reinforcement learning framework, which redefines the feature selection problem by converting the traditional $${2}^{N}$$ 2 N feature selection space into $$N$$ N agent representations. This approach reduces model complexity and enhances the search strategy for feature selection. To ensure accurate feature representation and expedite the feature selection process, we have also developed a GCN (Graph Convolutional Network) method that extracts deeper features from the data. The DRL attack detection module utilizes the Mini-Batchs technique to encode the data, allowing reinforcement learning to be applied in a supervised learning context. This integration improves accuracy. Additionally, the policy network in this module is designed to be minimalist, enhancing model efficiency. To evaluate the performance of our model, we conducted comprehensive simulation experiments using Python. We tested the model using the CSE-CIC-IDS2018 and NSL-KDD datasets, achieving impressive accuracy rates of 96.8% and 99.1%, as well as F1-Scores of 96.3% and 99.1%, respectively. The selected feature subset successfully eliminates approximately 80% of redundant features compared to the original feature set. Furthermore, we compared our proposed model with other popular machine-learning models.