Anomaly Detection for HTTP Using Convolutional Autoencoders

Seungyoung Park; Myungjin Kim; Seokwoo Lee

doi:10.1109/ACCESS.2018.2881003

IEEE Access (Jan 2018)

Affiliations

Seungyoung Park: Department of Electrical and Electronics Engineering, Kangwon National University, Chuncheon, South Korea
Myungjin Kim: Penta Security Systems Inc., Seoul, South Korea
Seokwoo Lee: Penta Security Systems Inc., Seoul, South Korea

DOI: https://doi.org/10.1109/ACCESS.2018.2881003
Journal volume & issue: Vol. 6
pp. 70884 – 70901

Abstract

Hypertext transfer protocol (HTTP) intrusion has long been a major issue in network security. Anomaly detection methods for detecting such intrusions have been shown to be highly effective, as they learn patterns from the characteristics of normal HTTP messages and search for deviations to detect anomalous messages. Various anomaly detection schemes have been proposed using deep learning algorithms, which require a set of input features to represent an HTTP message. However, heuristically selected input features result in limited performance owing to their lack of understanding of HTTP messages. Recently, it has been shown that documents can be successfully classified by binary images transformed from documents at the character level as the input features for a convolutional neural network (CNN). Thus, document classification is possible without any prior knowledge of words, syntactics, or semantics. This motivates us to mitigate the issue of heuristically selected features in anomaly detection, as HTTP messages also consist of characters. In this paper, we propose an anomaly detection technique for HTTP messages by using a convolutional autoencoder (CAE) with character-level binary image transformation. The CAE consists of an encoder and a decoder with CNN structures that are symmetrical to each other. Furthermore, when an image that has been transformed from a message is submitted to the CAE, it tries to produce a similar image. Toward this end, the CAE is trained to minimize the binary cross entropy (BCE) between the input and output images for normal messages. After adequate training, the proposed scheme can detect an anomalous message if its BCE is larger than a prespecified threshold value. Experimental results show that the proposed scheme outperforms conventional machine learning schemes, such as a one-class support vector machine and an isolation forest, which use heuristically selected input features. 
In addition, it is shown that improved performance can be achieved by using a deeper CAE structure and a new decision variable, namely binary cross varentropy, instead of BCE. Finally, to investigate the validity of the character-level image transformation, we employ a character embedding in the image transformation, which requires additional computational load but achieves negligible performance improvement.
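The detection pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the alphabet (printable ASCII), the image width `MAX_LEN`, the threshold value, and all function names are assumptions chosen for illustration, and the trained CAE is replaced by hand-made stand-in reconstructions. The abstract also does not define binary cross varentropy; the sketch assumes it is the variance of the per-pixel cross-entropy terms, by analogy with varentropy as the variance of information content.

```python
import numpy as np

# Assumed alphabet: printable ASCII. The abstract does not specify one.
ALPHABET = [chr(c) for c in range(32, 127)]
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
MAX_LEN = 64  # assumed image width (characters kept per message)


def message_to_image(message, max_len=MAX_LEN):
    """One-hot encode a message: rows are characters, columns are positions."""
    image = np.zeros((len(ALPHABET), max_len), dtype=np.float64)
    for col, ch in enumerate(message[:max_len]):
        row = CHAR_INDEX.get(ch)
        if row is not None:  # characters outside the alphabet stay all-zero
            image[row, col] = 1.0
    return image


def per_pixel_bce(image, reconstruction, eps=1e-7):
    """Binary cross entropy of every pixel between input and output images."""
    p = np.clip(np.asarray(reconstruction, dtype=np.float64), eps, 1.0 - eps)
    return -(image * np.log(p) + (1.0 - image) * np.log(1.0 - p))


def bce_score(image, reconstruction):
    """Mean per-pixel BCE, used as the decision variable."""
    return float(per_pixel_bce(image, reconstruction).mean())


def bcv_score(image, reconstruction):
    """Assumed reading of binary cross varentropy: variance of per-pixel BCE."""
    return float(per_pixel_bce(image, reconstruction).var())


def is_anomalous(score, threshold):
    """Flag a message when its score exceeds the prespecified threshold."""
    return score > threshold


if __name__ == "__main__":
    image = message_to_image("GET /index.html?id=1 HTTP/1.1")
    # Stand-ins for CAE outputs (no model is trained in this sketch).
    close_output = 0.9 * image + 0.05        # resembles the input image
    poor_output = np.full_like(image, 0.5)   # carries no information
    threshold = 0.2                          # assumed value
    for name, output in [("close", close_output), ("poor", poor_output)]:
        score = bce_score(image, output)
        print(name, round(score, 4), is_anomalous(score, threshold))
```

In practice the reconstruction would come from a CAE trained only on normal messages, and the threshold would be chosen from the score distribution on held-out normal traffic.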

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
