Cybersecurity (Aug 2024)

ProcSAGE: an efficient host threat detection method based on graph representation learning

  • Boyuan Xu,
  • Yiru Gong,
  • Xiaoyu Geng,
  • Yun Li,
  • Cong Dong,
  • Song Liu,
  • Yuling Liu,
  • Bo Jiang,
  • Zhigang Lu

DOI
https://doi.org/10.1186/s42400-024-00240-w
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 14

Abstract

Advanced Persistent Threats (APTs) penetrate internal networks through multiple methods, making it difficult to detect attack clues through boundary defense measures alone. To address this challenge, some research has proposed threat detection methods based on provenance graphs, which leverage the relationships among entities such as processes, files, and sockets recorded in host audit logs. However, these methods are generally inefficient, especially given the massive volume of audit logs and the computationally intensive nature of graph algorithms. Extracting APT attack clues from massive system audit logs effectively and economically remains a significant challenge. To tackle this problem, this paper introduces ProcSAGE, a method that detects threats based on abnormal behavior patterns and offers high accuracy, low cost, and independence from expert knowledge. In the graph construction phase, ProcSAGE focuses on the processes or threads in host audit logs, which controls the scale of the provenance graph and reduces performance overhead. In the feature extraction phase, ProcSAGE considers information about the processes or threads themselves together with their neighboring nodes, which characterizes them accurately and improves model accuracy. To verify the effectiveness of ProcSAGE, this study conducted a comprehensive evaluation on the StreamSpot dataset. The experimental results show that ProcSAGE significantly reduces the time and memory consumed in threat detection while improving accuracy, and that the optimization effect becomes more pronounced as the data size grows.
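
The two ideas described in the abstract, a graph restricted to process nodes and features that combine a node with its neighbors, can be illustrated with a small sketch. This is not the authors' implementation: the event tuple format, the function names `build_process_graph` and `embed`, the per-operation count features, and the GraphSAGE-style mean aggregation are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical audit events: (process, operation, object, object_type).
EVENTS = [
    ("p1", "read", "/etc/passwd", "file"),
    ("p1", "fork", "p2", "process"),
    ("p2", "connect", "10.0.0.5:443", "socket"),
    ("p2", "write", "/tmp/out", "file"),
]
OPS = ["read", "fork", "connect", "write"]


def build_process_graph(events):
    """Keep only process nodes; files and sockets become per-process counts."""
    features = defaultdict(lambda: defaultdict(int))
    neighbors = defaultdict(set)
    for proc, op, obj, obj_type in events:
        features[proc][op] += 1
        if obj_type == "process":
            # Process-to-process events are the only edges kept in the graph.
            neighbors[proc].add(obj)
            neighbors[obj].add(proc)
            features[obj]  # make sure the child process has a feature entry
    return features, neighbors


def embed(features, neighbors, ops):
    """Concatenate each node's own vector with the mean of its neighbors' vectors."""
    vec = {p: [float(f.get(op, 0)) for op in ops] for p, f in features.items()}
    out = {}
    for p, own in vec.items():
        nbrs = sorted(neighbors.get(p, ()))
        if nbrs:
            mean = [sum(vec[n][i] for n in nbrs) / len(nbrs) for i in range(len(ops))]
        else:
            mean = [0.0] * len(ops)
        out[p] = own + mean
    return out


features, neighbors = build_process_graph(EVENTS)
embeddings = embed(features, neighbors, OPS)
```

In this toy setup the graph holds two nodes instead of five, since the file and socket entities are folded into the process features. The resulting vectors could then be passed to any anomaly detector; which detector the paper uses is not stated in the abstract.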

Keywords