Complex & Intelligent Systems (Nov 2024)
Towards accurate anomaly detection for cloud system via graph-enhanced contrastive learning
Abstract
As a critical technology, anomaly detection ensures the smooth operation of cloud systems and helps maintain the market competitiveness of cloud service providers. However, resource data in real-world cloud systems is predominantly unannotated, leading to insufficient supervisory signals for anomaly detection. Moreover, complicated topological associations exist between cloud servers (e.g., computation, storage, and communication). Correlating the acquired resource information with the system topology is therefore challenging. To this end, we propose GCAD for cloud system anomaly detection, which integrates data augmentation, GraphGRU, contrastive learning, and reconstruction. First, GCAD constructs positive and negative sample pairs through masking and Gaussian-noise data augmentation. Then, the GraphGRU processes the extended temporal graph data, extracting and fusing spatiotemporal features from resource status and system topology. In addition, GCAD introduces linear attention for encoding the spatiotemporal representations to capture their global correlation information. The weight parameters of the encoder are optimized using a contrastive learning mechanism. Finally, GCAD uses a reconstruction technique to calculate anomaly scores, which allows the state of the cloud system to be evaluated at each time point. Experimental results indicate that GCAD outperforms the state-of-the-art methods it is compared against on two real-world datasets that contain topology information.
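As a rough illustration of two steps named in the abstract, the following minimal NumPy sketch shows masking and Gaussian-noise augmentation of a resource tensor, and a reconstruction-error anomaly score per time point. It is not the GCAD implementation. The tensor layout (time, servers, features), the function names, the mask ratio, the noise scale, and the use of mean squared error are all assumptions made here for illustration. The abstract does not state which augmentation produces positive or negative samples, so the sketch only shows the two transforms.

```python
import numpy as np


def mask_augment(x, mask_ratio=0.2, rng=None):
    """Zero out a random fraction of entries (hypothetical masking scheme)."""
    rng = np.random.default_rng() if rng is None else rng
    # keep is True where the entry survives, False where it is masked
    keep = rng.random(x.shape) >= mask_ratio
    return x * keep


def gaussian_noise_augment(x, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise with standard deviation sigma (assumed scale)."""
    rng = np.random.default_rng() if rng is None else rng
    return x + rng.normal(0.0, sigma, size=x.shape)


def anomaly_scores(x, x_hat):
    """Score each time step by its mean squared reconstruction error.

    x and x_hat have shape (time, servers, features); the result has
    shape (time,). MSE is an assumed choice of error measure.
    """
    return ((x - x_hat) ** 2).mean(axis=(1, 2))
```

Given a reconstruction `x_hat` from any model, time points whose score exceeds a chosen threshold would be flagged as anomalous. How that threshold is chosen is not specified in the abstract.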
Keywords