Improving Data Generalization With Variational Autoencoders for Network Traffic Anomaly Detection

Mehrnoosh Monshizadeh; Vikramajeet Khatri; Marah Gamdou; Raimo Kantola; Zheng Yan

doi:10.1109/ACCESS.2021.3072126

IEEE Access (Jan 2021)

Affiliations

Mehrnoosh Monshizadeh: Nokia Bell Labs, Nozay, France
Vikramajeet Khatri: Nokia Bell Labs, Espoo, Finland
Marah Gamdou: CentraleSupélec Engineering School, Paris-Saclay University, Gif-sur-Yvette, France
Raimo Kantola: Department of Comnet, Aalto University, Espoo, Finland
Zheng Yan: Department of Comnet, Aalto University, Espoo, Finland

DOI: https://doi.org/10.1109/ACCESS.2021.3072126
Journal volume & issue: Vol. 9, pp. 56893–56907

Abstract

Deep generative models have become increasingly popular in domains such as image processing, yet they rarely appear in cybersecurity. Although these models are mainly applied to dimensionality reduction, they have only marginally been used to address challenges such as poor data generalization and the overfitting inherited from feature selection methods. To address these challenges, we propose a combined architecture comprising a Conditional Variational AutoEncoder (CVAE) and a Random Forest (RF) classifier that automatically learns similarity among input features, models the data distribution in order to extract discriminative features from the original features, and finally classifies various types of attacks. The CVAE introduces the labels of traffic packets into the latent space in order to better learn the variations among input samples and distinguish the data characteristics of each class. This avoids confusion between classes while learning the whole data distribution. Compared across various evaluation metrics with feature selection mechanisms such as Support Vector Machine Online (SVMo), the proposed architecture demonstrates considerable performance improvement. To verify the versatility of the proposed architecture, experiments were conducted on two publicly available datasets.
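
The conditioning idea described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say how the labels are injected, so this assumes the common CVAE formulation in which a one-hot label is concatenated to the encoder input and to the latent sample. All function names, feature values, and encoder outputs below are hypothetical placeholders, and no network is trained.

```python
import math
import random


def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def condition(vector, label, num_classes):
    """Append the one-hot label to a vector (the 'conditional' part of a CVAE)."""
    return list(vector) + one_hot(label, num_classes)


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from a diagonal Gaussian N(mu, sigma^2) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


rng = random.Random(0)

# Hypothetical normalized traffic features and a class label (4 classes assumed).
x = [0.2, 0.7, 0.1]
label, num_classes = 2, 4

# Encoder side: the label is concatenated to the input features.
encoder_input = condition(x, label, num_classes)

# Stand-ins for what a trained encoder would output for this input.
mu, log_var = [0.0, 0.5], [0.0, -1.0]

# Latent side: sample z, then attach the label before decoding.
z = reparameterize(mu, log_var, rng)
decoder_input = condition(z, label, num_classes)

# Regularization term that would be added to the reconstruction loss.
kl = kl_to_standard_normal(mu, log_var)
```

In a full pipeline of the kind the abstract describes, the learned representation would then be passed to a Random Forest classifier; that stage is omitted here because the abstract does not specify which representation is used or how the classifier is configured.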

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
