Journal of Big Data (Apr 2025)

Graph neural network approach with spatial structure to anomaly detection of network data

  • Hao Zhang,
  • Yun Zhou,
  • Huahu Xu,
  • Jiangang Shi,
  • Xinhua Lin,
  • Yiqin Gao

DOI
https://doi.org/10.1186/s40537-025-01149-y
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 27

Abstract

Network anomaly detection using graph-structured data is a critical task in data mining and cybersecurity: it identifies unusual patterns within a network by analyzing its structure as a graph. However, network data are often high-dimensional and sparse, which complicates the detection of rare but meaningful anomalous patterns. Accurately modeling the distances between nodes in the spatial structure is particularly challenging. Additionally, the scarcity of labelled anomaly data for training supervised models can hinder the accuracy and effectiveness of anomaly detection methods. To address these challenges, we propose a novel method that enhances distance feature extraction by embedding graph data into hyperbolic space, which naturally captures hierarchical and relational structures in graphs. The method has been validated both mathematically and experimentally. Specifically, we introduce a gain factor derived from commonality metrics that adheres to conformal properties and preserves relative distances in the space. The gain factor improves the precision of the edge weight features used for graph construction, so that relative distances between points are accurately preserved in the embedding space. Data augmentation techniques are employed to mitigate the scarcity of labelled data. Results demonstrate that optimizing edge weights improves anomaly detection performance more than optimizing node attributes does, contributing twice as much. Optimizing edge weights is more effective at capturing interaction anomalies between nodes, whereas focusing solely on node attributes may overlook subtle irregularities at the relational level. Furthermore, the findings suggest that the proposed approach is suitable for complex anomaly detection environments owing to its robustness and scalability.
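
To illustrate why hyperbolic space suits distance modeling, the following minimal sketch computes geodesic distances in the Poincaré ball, one common model of hyperbolic space. This is an assumption for illustration only: the abstract does not state which hyperbolic model the method uses, and the function names `poincare_distance` and `conformal_factor` are hypothetical. The conformal factor shown is the standard metric scaling of the Poincaré ball, not the gain factor proposed in the paper.

```python
import math


def _sq_norm(p):
    """Squared Euclidean norm of a point given as a sequence of floats."""
    return sum(x * x for x in p)


def conformal_factor(p):
    """Standard Poincare-ball conformal factor 2 / (1 - ||p||^2).

    Illustrative only; this is NOT the paper's gain factor.
    """
    sq = _sq_norm(p)
    if sq >= 1.0:
        raise ValueError("point must lie strictly inside the unit ball")
    return 2.0 / (1.0 - sq)


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_u, sq_v = _sq_norm(u), _sq_norm(v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Two pairs of points with the same angular layout: one pair near the
# origin, one pair near the boundary of the ball.
near_origin = poincare_distance((0.1, 0.0), (0.0, 0.1))
near_boundary = poincare_distance((0.9, 0.0), (0.0, 0.9))
```

Distances grow rapidly toward the boundary of the ball, so `near_boundary` is much larger than `near_origin` even after accounting for the larger Euclidean separation. This property lets tree-like, hierarchical graphs embed with low distortion, which is the motivation the abstract gives for working in hyperbolic space.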

Keywords