Proceedings of the XXth Conference of Open Innovations Association FRUCT (May 2023)

Comparing Autoencoder and Isolation Forest in Network Anomaly Detection

  • Timotej Smolen,
  • Lenka Benova

DOI
https://doi.org/10.23919/FRUCT58615.2023.10143005
Journal volume & issue
Vol. 33, no. 1
pp. 276 – 282

Abstract

Anomaly detection is essential for spotting cyber-attacks within networks. Unsupervised anomaly detection methods are becoming more popular because labeling network data is difficult and expensive, and because they are better at detecting unknown attacks than supervised or signature-based solutions. In this paper, we introduce an LSTM-based Autoencoder anomaly detection model trained in a fully unsupervised setting, with optimizations for minimal memory usage. We then compare the Autoencoder model with an Isolation Forest model by analysing their results. Our Autoencoder attempts to capture the profile of the data through dimensionality reduction, and its LSTM layers enable it to leverage data from previous requests. The reconstruction error is calculated to decide whether a request is anomalous. We train the models on a dataset of requests to a web server in an unsupervised fashion. Before training, significant feature engineering is done to process multiple categorical attributes. The training process of the introduced Autoencoder is optimized for minimal memory usage. We evaluated the results based on our analysis of the data as well as their statistical features. A manual analysis revealed that the two models focus on different attribute types. The Isolation Forest disregards most categorical attributes and emphasizes numerical values. The Autoencoder, on the other hand, detects missing features more effectively but largely disregards numerical attributes. As such, the Autoencoder might have a higher probability of detecting a zero-day attack than the Isolation Forest.
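
The reconstruction-error decision described in the abstract can be illustrated with a minimal sketch. It assumes the reconstructions come from an already trained autoencoder, which is not shown here. The function names, the toy vectors, and the fixed threshold are hypothetical and are not taken from the paper, which does not state how its threshold is chosen.

```python
def reconstruction_errors(inputs, reconstructions):
    """Return the mean squared error of each sample against its reconstruction."""
    errors = []
    for x, x_hat in zip(inputs, reconstructions):
        squared = [(a - b) ** 2 for a, b in zip(x, x_hat)]
        errors.append(sum(squared) / len(x))
    return errors


def flag_anomalies(errors, threshold):
    """Mark a sample as anomalous when its error exceeds the threshold."""
    return [e > threshold for e in errors]


# Toy feature vectors standing in for encoded requests (hypothetical data).
inputs = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
# Stand-in for the output of a trained autoencoder: the last request is
# reconstructed poorly, as would be expected for an unfamiliar pattern.
reconstructions = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.0, 0.0]]

errors = reconstruction_errors(inputs, reconstructions)
threshold = 0.5  # assumed value, chosen only for this illustration
flags = flag_anomalies(errors, threshold)
```

In practice the threshold would more likely be derived from the distribution of errors on the training data, for example a high percentile, rather than fixed by hand.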

Keywords