Future Internet (Jan 2024)
Classification Tendency Difference Index Model for Feature Selection and Extraction in Wireless Intrusion Detection
Abstract
In detecting large-scale attacks, deep neural networks (DNNs) are an effective approach based on high-quality training data samples. Feature selection and feature extraction are the primary approaches for data quality enhancement for high-accuracy intrusion detection. However, their enhancement root causes usually present weak relationships to the differences between normal and attack behaviors in the data samples. Thus, we propose a Classification Tendency Difference Index (CTDI) model for feature selection and extraction in intrusion detection. The CTDI model consists of three indexes: Classification Tendency Frequency Difference (CTFD), Classification Tendency Membership Difference (CTMD), and Classification Tendency Distance Difference (CTDD). In the dataset, each feature has many feature values (FVs). In each FV, the normal and attack samples indicate the FV classification tendency, and CTDI shows the classification tendency differences between the normal and attack samples. CTFD is the frequency difference between the normal and attack samples. By employing fuzzy C means (FCM) to establish the normal and attack clusters, CTMD is the membership difference between the clusters, and CTDD is the distance difference between the cluster centers. CTDI calculates the index score in each FV and summarizes the scores of all FVs in the feature as the feature score for each of the three indexes. CTDI adopts an Auto Encoder for feature extraction to generate new features from the dataset and calculate the three index scores for the new features. CTDI sorts the original and new features for each of the three indexes to select the best features. The selected CTDI features indicate the best classification tendency differences between normal and attack samples. The experiment results demonstrate that the CTDI features achieve better detection accuracy as classified by DNN for the Aegean WiFi Intrusion Dataset than their related works, and the detection enhancements are based on the improved classification tendency differences in the CTDI features.
Keywords