IEEE Open Journal of the Communications Society (Jan 2024)
Few-Shot Class-Incremental Learning for Network Intrusion Detection Systems
Abstract
In today’s digital landscape, critical services are increasingly dependent on network connectivity, thus cybersecurity has become paramount. Indeed, the constant escalation of cyberattacks, including zero-day exploits, poses a significant threat. While Network Intrusion Detection Systems (NIDSs) leveraging machine-learning and deep-learning models have proven effective in recent studies, they encounter limitations such as the need for abundant samples of malicious traffic and full retraining upon encountering new attacks. These limitations hinder their adaptability in real-world scenarios. To address these challenges, we design a novel NIDS capable of promptly adapting to classify new attacks and provide timely predictions. Our proposal for attack-traffic classification adopts Few-Shot Class-Incremental Learning (FSCIL) and is based on the Rethinking Few-Shot (RFS) approach, which we experimentally prove to overcome other FSCIL state-of-the-art alternatives based on either meta-learning or transfer learning. We evaluate the proposed NIDS across a wide array of cyberattacks whose traffic is collected in recent publicly available datasets to demonstrate its robustness across diverse network-attack scenarios, including malicious activities in an Internet-of-Things context and cyberattacks targeting servers. We validate various design choices as well, involving the number of traffic samples per attack available, the impact of the features used to represent the traffic objects, and the time to deliver the classification verdict. Experimental results witness that our proposed NIDS effectively retains previously acquired knowledge (with over 94% F1-score) while adapting to new attacks with only few samples available (with over 98% F1-score). Thus, it outperforms non-FSCIL state of the art in terms of classification effectiveness and adaptation time. Moreover, our NIDS exhibits high performance even with traffic collected within short time frames, achieving 95% F1-score while reducing the time-to-insight. Finally, we identify possible limitations likely arising in specific application contexts and envision promising research avenues to mitigate them.
Keywords