Comparing Autoencoder and Isolation Forest in Network Anomaly Detection

Timotej Smolen; Lenka Benova

doi:10.23919/FRUCT58615.2023.10143005

Proceedings of the XXth Conference of Open Innovations Association FRUCT (May 2023)

Affiliations

Timotej Smolen: Slovak University of Technology, Faculty of Informatics and Information Technologies
Lenka Benova: Slovak University of Technology, Faculty of Informatics and Information Technologies

Journal volume & issue: Vol. 33, no. 1
pp. 276 – 282

Abstract

Anomaly detection is essential for spotting cyber-attacks within networks. Unsupervised anomaly detection methods are becoming more popular because labeling network data is difficult and expensive, and because they detect unknown attacks better than supervised or signature-based solutions. In this paper, we first introduce an LSTM-based Autoencoder anomaly detection model trained in a fully unsupervised environment, with a training process optimized for minimal memory usage. Second, we compare the Autoencoder model with an Isolation Forest model by analysing their results. Our Autoencoder attempts to capture the profile of the data through dimensionality reduction, and its LSTM layers enable it to leverage data from previous requests. The reconstruction error is then used to decide whether a request is anomalous. We train both models in an unsupervised fashion on a dataset of requests to a web server. Before training, significant feature engineering is done to process multiple categorical attributes. We evaluated the results based on our analysis of the data as well as their statistical features. A manual analysis revealed that the two models focus differently on numerical and categorical attributes. The Isolation Forest disregards most categorical attributes and emphasizes numerical values. The Autoencoder, on the other hand, detects missing features more effectively but largely disregards numerical attributes. As such, the Autoencoder might have a higher probability of detecting a zero-day attack than the Isolation Forest.
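
To make the Isolation Forest side of the comparison more concrete, the following is a minimal pure-Python sketch of the general isolation-forest idea: points that random axis-aligned splits separate quickly receive a high anomaly score. It is not the authors' implementation. The two numerical features (response size and request duration), the synthetic data, and all function names are assumptions made purely for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Approximate average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    # Only features that still vary within this node can be split on.
    varying = [f for f in range(len(rows[0]))
               if min(r[f] for r in rows) < max(r[f] for r in rows)]
    if not varying:
        return {"size": len(rows)}
    feature = rng.choice(varying)
    values = [r[feature] for r in rows]
    split = rng.uniform(min(values), max(values))
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {"feature": feature, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(x, node, depth=0):
    """Edges traversed to reach a leaf, plus a correction for unsplit leaves."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x[node["feature"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def isolation_scores(rows, n_trees=100, sample_size=64, seed=0):
    """Return one score in (0, 1] per row; higher means easier to isolate."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    scores = []
    for x in rows:
        mean_path = sum(path_length(x, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores


# Hypothetical numerical features per request: (response bytes, duration in ms).
data_rng = random.Random(42)
requests = [(data_rng.gauss(500, 50), data_rng.gauss(20, 5)) for _ in range(200)]
requests.append((50000.0, 900.0))  # one deliberately extreme request

scores = isolation_scores(requests)
print("score of the extreme request:", round(scores[-1], 3))
print("median score:", round(sorted(scores)[len(scores) // 2], 3))
```

The sketch uses numerical columns only. Categorical attributes would first need an encoding such as one-hot, which is left out here.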

Keywords: anomaly detection, network traffic, intrusion detection, autoencoder, isolation forest, LSTM, unsupervised learning, web server logs

Published in Proceedings of the XXth Conference of Open Innovations Association FRUCT

ISSN: 2305-7254 (Print); 2343-0737 (Online)
Publisher: FRUCT
Country of publisher: Finland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication
Website: http://fruct.org/publication
