IEEE Access (Jan 2022)
Pick Quality Over Quantity: Expert Feature Selection and Data Preprocessing for 802.11 Intrusion Detection Systems
Abstract
Wi-Fi is arguably the most proliferated wireless technology today. Due to its massive adoption, Wi-Fi deployments always remain in the epicenter of attackers and evildoers. Surprisingly, research regarding machine learning driven intrusion detection systems (IDS) that are specifically optimized to detect Wi-Fi attacks is lagging behind. On top of that, the field is dominated by false or half-true assumptions that potentially can lead to corresponding models being overfilled to certain validation datasets, simply giving the impression or illusion of high efficiency. This work attempts to provide concrete answers to the following key questions regarding IEEE 802.11 machine learning driven IDS. First, from an expert’s viewpoint and with reference to the relevant literature, what are the criteria for determining the smallest possible set of classification features, which are also common and potentially transferable to virtually any deployment types/versions of 802.11? And second, based on these features, what is the detection performance across different network versions and diverse machine learning techniques, i.e., shallow versus deep learning ones? To answer these questions, we rely on the renowned 802.11 security-oriented AWID family of datasets. In a nutshell, our experiments demonstrate that with a rather small set of 16 features and without the use of any optimization or ensemble method, shallow and deep learning classification can achieve an average F1 score of up to 99.55% and 97.55%, respectively. We argue that the suggested human expert driven feature selection leads to lightweight, deployment-agnostic detection systems, and therefore can be used as a basis for future work in this interesting and rapidly evolving field.
Keywords