Classification Tendency Difference Index Model for Feature Selection and Extraction in Wireless Intrusion Detection

Chinyang Henry Tseng; Woei-Jiunn Tsaur; Yueh-Mao Shen

doi:10.3390/fi16010025

Future Internet (Jan 2024)

Affiliations

Chinyang Henry Tseng: Department of Computer Science and Information Engineering, National Taipei University, New Taipei City 23741, Taiwan
Woei-Jiunn Tsaur: Computer Center, National Taipei University, New Taipei City 23741, Taiwan
Yueh-Mao Shen: College of Electrical Engineering and Computer Science, National Taipei University, New Taipei City 23741, Taiwan

DOI: https://doi.org/10.3390/fi16010025
Journal volume & issue: Vol. 16, no. 1, p. 25

Abstract

Deep neural networks (DNNs) are an effective approach to detecting large-scale attacks, provided they are trained on high-quality data samples. Feature selection and feature extraction are the primary approaches to enhancing data quality for high-accuracy intrusion detection. However, the reasons these approaches improve detection are usually only weakly related to the differences between normal and attack behaviors in the data samples. Thus, we propose a Classification Tendency Difference Index (CTDI) model for feature selection and extraction in intrusion detection. The CTDI model consists of three indexes: Classification Tendency Frequency Difference (CTFD), Classification Tendency Membership Difference (CTMD), and Classification Tendency Distance Difference (CTDD). In the dataset, each feature takes many feature values (FVs). For each FV, the normal and attack samples indicate the FV's classification tendency, and CTDI measures the classification tendency differences between the normal and attack samples. CTFD is the frequency difference between the normal and attack samples. Using fuzzy C-means (FCM) to establish the normal and attack clusters, CTMD is the membership difference between the clusters, and CTDD is the distance difference between the cluster centers. For each of the three indexes, CTDI calculates an index score for each FV and aggregates the scores of all FVs of a feature into the feature score. CTDI adopts an autoencoder for feature extraction, generating new features from the dataset, and calculates the three index scores for these new features. CTDI then ranks the original and new features under each of the three indexes to select the best features. The selected CTDI features exhibit the best classification tendency differences between normal and attack samples. The experimental results demonstrate that the CTDI features, when classified by a DNN, achieve better detection accuracy on the Aegean WiFi Intrusion Dataset than related works, and that the detection improvements stem from the improved classification tendency differences in the CTDI features.
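
To make the per-FV scoring idea more concrete, the following is a minimal Python sketch. It is not the paper's implementation. The abstract does not give the exact CTFD or CTMD formulas, so the function names, the "normal"/"attack" label strings, the use of an absolute difference of relative frequencies, and the summation over FVs are all illustrative assumptions. The membership function is the standard fuzzy C-means membership formula, applied here to fixed, hypothetical cluster centers.

```python
from collections import Counter


def ctfd_like_score(values, labels):
    """Frequency-difference score for one feature (illustrative assumption).

    For each feature value (FV), take the absolute difference between its
    relative frequency among normal samples and among attack samples, then
    sum over all FVs. This is NOT claimed to be the paper's CTFD formula.
    """
    normal = Counter(v for v, y in zip(values, labels) if y == "normal")
    attack = Counter(v for v, y in zip(values, labels) if y == "attack")
    n_normal = sum(normal.values()) or 1  # avoid division by zero
    n_attack = sum(attack.values()) or 1
    score = 0.0
    for fv in set(normal) | set(attack):
        score += abs(normal[fv] / n_normal - attack[fv] / n_attack)
    return score


def fcm_memberships(x, centers, m=2.0):
    """Standard fuzzy C-means membership of scalar x in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with fuzzifier m > 1.
    The centers are assumed to be given (e.g. normal and attack centers).
    """
    dists = [abs(x - c) for c in centers]
    zeros = sum(1 for d in dists if d == 0)
    if zeros:  # x coincides with one or more centers
        return [1.0 / zeros if d == 0 else 0.0 for d in dists]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exp for d_k in dists) for d_i in dists]


if __name__ == "__main__":
    # Toy data: value "a" appears only in normal traffic, "b" only in attacks.
    values = ["a", "a", "b", "b"]
    labels = ["normal", "normal", "attack", "attack"]
    print(ctfd_like_score(values, labels))

    # Memberships of x = 1.0 in hypothetical normal (0.0) and attack (4.0) clusters.
    u_normal, u_attack = fcm_memberships(1.0, [0.0, 4.0])
    print(u_normal - u_attack)  # a membership difference, in the spirit of CTMD
```

Under these assumptions, a feature whose values separate normal from attack traffic cleanly receives a higher frequency-difference score than a feature whose values are distributed identically in both classes, which is the property the ranking step relies on.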

Published in Future Internet

ISSN: 1999-5903 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/futureinternet/
