IEEE Access (Jan 2024)

Lightweight Multi-Task Learning Method for System Log Anomaly Detection

  • Tuan-Anh Pham
  • Jong-Hoon Lee

DOI
https://doi.org/10.1109/ACCESS.2024.3425369
Journal volume & issue
Vol. 12
pp. 147739 – 147752

Abstract

Log anomaly detection, alongside metrics and traces, is a crucial part of monitoring IT systems. Anomalies can be detected from two kinds of log data: individual logs and log sequences. An individual log reflects an independent system status, whereas a sequence of logs describes the execution path of a system. When the pattern of a log sequence deviates substantially from normal execution behavior, this may indicate a system anomaly. For sequence-based log anomaly detection, supervised learning methods are often preferred because of their high performance. However, these methods require labeled data for training. As systems evolve, the volume of logs grows significantly, which makes labeling labor-intensive and impractical. Semi-supervised and unsupervised techniques are therefore better alternatives for detection. In practice, detecting log anomalies remains challenging because of several problems, such as unstable logs, new types of logs, and unexplored log semantics. To address these problems and improve detection performance, we propose MultiLog, a lightweight semi-supervised multi-task learning method. Its key components are the pre-trained language model BERT, dimension reduction, the attention mechanism from the Transformer, and multi-task learning. Following previous studies, we conduct comprehensive experiments on three widely used datasets: HDFS, BGL, and Thunderbird. In terms of efficiency, the proposed model is 50 times smaller than the original model while maintaining its F1-scores. In terms of effectiveness, the proposed model outperforms baseline methods and achieves performance comparable to that of supervised learning models.
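
The abstract names four components (BERT embeddings, dimension reduction, Transformer-style attention, and multi-task learning) without giving implementation details. The following is a minimal illustrative sketch of how such components could be chained, not the authors' MultiLog implementation. Everything specific in it is an assumption: random vectors stand in for BERT embeddings (768 is the hidden size of BERT-base), a random linear projection stands in for dimension reduction, the attention uses identity query/key/value projections, and the two output heads (`w_task_a`, `w_task_b`) are hypothetical placeholders, since the abstract does not say which tasks are learned jointly.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights; a real model would train these."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def self_attention(x):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(x[0])
    scores = [[sum(q * k for q, k in zip(qi, kj)) for kj in x] for qi in x]
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, x), weights


def forward(log_embeddings, w_reduce, w_task_a, w_task_b):
    """Reduce dimension, attend over the sequence, mean-pool, apply two heads."""
    reduced = matmul(log_embeddings, w_reduce)          # (seq_len, reduced_dim)
    attended, weights = self_attention(reduced)         # (seq_len, reduced_dim)
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    out_a = matmul([pooled], w_task_a)[0]               # hypothetical task A
    out_b = matmul([pooled], w_task_b)[0]               # hypothetical task B
    return out_a, out_b, weights


# Assumed sizes: 5 log messages per sequence, 768-d embeddings reduced to 16-d.
rng = random.Random(0)
seq_len, embed_dim, reduced_dim = 5, 768, 16
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(embed_dim)] for _ in range(seq_len)]
w_reduce = random_matrix(embed_dim, reduced_dim, rng)
w_task_a = random_matrix(reduced_dim, 2, rng)
w_task_b = random_matrix(reduced_dim, 1, rng)

out_a, out_b, attn = forward(embeddings, w_reduce, w_task_a, w_task_b)
```

Both heads read the same pooled representation, which is the usual way a shared encoder is reused across tasks. Reducing the embedding width before attention shrinks every downstream weight matrix, which is one plausible route to the smaller model size the abstract reports.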

Keywords