Intrusion Detection of Imbalanced Network Traffic Based on Machine Learning and Deep Learning

Lan Liu; Pengcheng Wang; Jun Lin; Langzhou Liu

doi:10.1109/ACCESS.2020.3048198

IEEE Access (Jan 2021)

Affiliations

Lan Liu: School of Electronic and Information Engineering, Guangdong Polytechnic Normal University, Guangzhou, China
Pengcheng Wang: School of Electronic and Information Engineering, Guangdong Polytechnic Normal University, Guangzhou, China
Jun Lin: China Electronic Product Reliability and Environmental Testing Research Institute, Guangzhou, China
Langzhou Liu: School of Electronic and Information Engineering, Guangdong Polytechnic Normal University, Guangzhou, China

DOI: https://doi.org/10.1109/ACCESS.2020.3048198
Journal volume & issue: Vol. 9, pp. 7550–7563

Abstract

In imbalanced network traffic, malicious cyber-attacks often hide within large volumes of normal data. Such attacks are highly stealthy and obfuscated, which makes it difficult for a Network Intrusion Detection System (NIDS) to ensure accurate and timely detection. This paper studies machine learning and deep learning for intrusion detection in imbalanced network traffic and proposes a novel Difficult Set Sampling Technique (DSSTE) algorithm to tackle the class imbalance problem. First, the Edited Nearest Neighbor (ENN) algorithm divides the imbalanced training set into a difficult set and an easy set. Next, the KMeans algorithm compresses the majority samples in the difficult set to reduce their number. The continuous attributes of the minority samples in the difficult set are then zoomed in and out to synthesize new samples and increase the minority count. Finally, the easy set, the compressed majority samples of the difficult set, and the minority samples of the difficult set together with their augmented samples are combined to form a new training set. The algorithm reduces the imbalance of the original training set and provides targeted data augmentation for the minority classes that need to be learned. This enables the classifier to better learn the differences between classes during training and improves classification performance. To verify the proposed method, we conduct experiments on the classic intrusion dataset NSL-KDD and the newer, more comprehensive intrusion dataset CSE-CIC-IDS2018. We use the classification models Random Forest (RF), Support Vector Machine (SVM), XGBoost, Long Short-Term Memory (LSTM), AlexNet, and Mini-VGGNet. We compare against 24 other methods; the experimental results demonstrate that the proposed DSSTE algorithm outperforms them.
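
The resampling steps described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It rests on several assumptions that the abstract does not state: labels are binary with a named majority class, every feature is treated as continuous, a small hand-rolled k-means with deterministic initialization stands in for KMeans, and the scaling factors 0.9 and 1.1 are hypothetical values chosen only for illustration. The function names `enn_split`, `compress`, and `dsste_sketch` are likewise invented for this example.

```python
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def enn_split(X, y, k=3):
    """Split sample indices into (easy, difficult) with an ENN-style rule:
    a sample is difficult when the majority label among its k nearest
    neighbours differs from its own label."""
    easy, difficult = [], []
    for i in range(len(X)):
        neighbours = sorted(
            (sq_dist(X[i], X[j]), j) for j in range(len(X)) if j != i
        )[:k]
        vote = Counter(y[j] for _, j in neighbours).most_common(1)[0][0]
        (easy if vote == y[i] else difficult).append(i)
    return easy, difficult


def compress(points, n_clusters, n_iter=10):
    """Replace points by k-means centroids. Initial centroids are simply the
    first n_clusters points (an assumption that keeps the sketch deterministic)."""
    centroids = [list(p) for p in points[:n_clusters]]
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members; keep the old
        # centroid if a cluster ended up empty.
        centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids


def dsste_sketch(X, y, majority, k=3, n_clusters=2, factors=(0.9, 1.1)):
    """Build a new training set: easy samples unchanged, difficult majority
    samples compressed to centroids, difficult minority samples kept and
    augmented with zoomed copies (hypothetical scaling factors)."""
    easy, difficult = enn_split(X, y, k)
    new_X = [list(X[i]) for i in easy]
    new_y = [y[i] for i in easy]

    hard_major = [X[i] for i in difficult if y[i] == majority]
    if hard_major:
        for centroid in compress(hard_major, n_clusters):
            new_X.append(centroid)
            new_y.append(majority)

    for i in difficult:
        if y[i] != majority:
            new_X.append(list(X[i]))
            new_y.append(y[i])
            for f in factors:
                # Zoom every feature in or out by the factor f.
                new_X.append([v * f for v in X[i]])
                new_y.append(y[i])
    return new_X, new_y
```

Calling `dsste_sketch(X, y, majority=0)` on a list of feature vectors `X` with labels `y` returns the resampled features and labels. A fuller version would need to handle multiple attack classes and leave discrete attributes out of the scaling step, since the abstract applies it only to continuous attributes.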

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
