Data Discretization and Decision Boundary Data Point Analysis for Unknown Attack Detection

Gun-Yoon Shin; Dong-Wook Kim; Myung-Mook Han

doi:10.1109/ACCESS.2022.3215269

IEEE Access (Jan 2022)

Affiliations

Gun-Yoon Shin: Department of Computer Engineering, Gachon University, Sungnam-si, South Korea
Dong-Wook Kim: Department of Computer Engineering, Gachon University, Sungnam-si, South Korea
Myung-Mook Han: Department of AI Software, Gachon University, Sungnam-si, South Korea

DOI: https://doi.org/10.1109/ACCESS.2022.3215269
Journal volume & issue: Vol. 10, pp. 114008–114015

Abstract

Researchers have long sought effective ways to detect unknown (zero-day) cyberattacks in real time. Most current methods rely on pattern recognition to identify known threats when they appear. Recently, machine learning (ML) anomaly detection tools that train a model on normal network data have been used to identify outliers representing unknown attacks. However, detecting unknown attacks remains difficult because of a lack of information about them, class imbalance in the data, or failure to accurately detect attacks that exhibit normal patterns. To overcome these problems, this study applied data discretization and decision-boundary data point analysis to scrutinize patterns near the thresholds of uncertainty. A novel discretization method was used to effectively train a model for the fuzzy c-means feature analysis of data points at the decision boundary, through which adversarial features were detected and classified based on their entropy. Consequently, it was possible to identify incorrectly detected attack data distributed near the model’s decision boundary. The proposed method was evaluated on the NSL-KDD dataset, which is commonly used to evaluate ML intrusion detection systems. The results showed that the model successfully identified attacks at the decision boundary and that its performance can be improved through classification. After classification, detection accuracy improved by 5 to 7% for DoS attacks, 7 to 10% for Probe, 4 to 7% for R2L, and 1 to 9% for U2R, compared with existing models.
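
The abstract does not give implementation details, so the following is only an illustrative sketch of two ideas it names: discretizing a continuous feature, and using entropy to flag data points that sit near a decision boundary. Equal-width binning is used here as a stand-in for the paper's novel discretization method, which is not described on this page. The membership rows are assumed to come from a fuzzy c-means run, and all function names and the threshold value are hypothetical.

```python
import math


def equal_width_bins(values, n_bins):
    """Map each continuous value to an integer bin index in [0, n_bins - 1].

    Assumption: plain equal-width binning, not the paper's own method.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls in one bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def membership_entropy(memberships):
    """Shannon entropy (in bits) of one point's cluster memberships.

    Assumption: memberships are non-negative and sum to 1, as in fuzzy c-means.
    """
    return -sum(p * math.log2(p) for p in memberships if p > 0)


def boundary_points(membership_rows, threshold):
    """Return indices of points whose membership entropy is >= threshold.

    High entropy means the point belongs almost equally to several clusters,
    which is one way to interpret "near the decision boundary".
    """
    return [
        i
        for i, row in enumerate(membership_rows)
        if membership_entropy(row) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical feature values and memberships, for illustration only.
    print(equal_width_bins([0.0, 2.5, 5.0, 10.0], n_bins=4))
    rows = [[0.99, 0.01], [0.55, 0.45], [0.5, 0.5]]
    print(boundary_points(rows, threshold=0.9))
```

In this sketch a point with memberships close to 0.5/0.5 has entropy near 1 bit and is flagged, while a confidently assigned point has entropy near 0 and is not. How the paper defines the boundary region and classifies the flagged points may differ.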

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
