ConNet: Deep Semi-Supervised Anomaly Detection Based on Sparse Positive Samples

Feng Gao; Jing Li; Ruiying Cheng; Yi Zhou; Ying Ye

doi:10.1109/ACCESS.2021.3077014

IEEE Access (Jan 2021)

Affiliations

Feng Gao: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Jing Li: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Ruiying Cheng: State Grid Information and Telecommunication Branch, Beijing, China
Yi Zhou: State Grid Information and Telecommunication Branch, Beijing, China
Ying Ye: Nari Group Corporation/State Grid Electric Power Research Institute, Nanjing, China

DOI: https://doi.org/10.1109/ACCESS.2021.3077014
Journal volume & issue: Vol. 9, pp. 67249–67258

Abstract

Existing semi-supervised anomaly detection methods usually train on a large amount of labeled normal data, which incurs high labeling costs. Only a few semi-supervised methods train models on unlabeled data together with a few labeled anomalies. However, such methods usually encounter two problems: (i) because anomalies usually have different behavior patterns, or the internal mechanisms that produce them are complex and diverse, a few labeled anomalies cannot cover all anomaly types; and (ii) the training set contains substantially more unlabeled data than labeled data, so contaminated unlabeled data often dominates the training process. To solve these two problems, we propose a semi-supervised anomaly detection method named ConNet and a new loss function named concentration loss. Specifically, ConNet consists of two stages. First, we obtain a prior anomaly score for each unlabeled instance via a prior estimation module and attach that score to the instance as its training weight. Then, an anomaly scoring network is trained to assign anomaly scores to data instances, ensuring that the anomaly scores of anomalies deviate significantly from those of normal instances. We conducted experiments on thirteen real-world data sets and evaluated our method in terms of detection accuracy, utilization efficiency of labeled data, and robustness to different contamination rates. The experimental results show that our method performs significantly better than state-of-the-art anomaly detection methods.
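
The two-stage idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the prior estimation module or the concentration loss, so everything below is an assumption made for illustration. The function names `prior_anomaly_scores`, `training_weights`, and `weighted_loss` are hypothetical, the prior estimator is a simple distance-to-median stand-in, the mapping from prior score to weight (one minus the score) is assumed, and the loss is a generic margin-style placeholder rather than the concentration loss.

```python
from statistics import median


def prior_anomaly_scores(rows):
    """Stand-in prior estimator (assumption): Euclidean distance to the
    per-feature median, min-max scaled to [0, 1]."""
    centers = [median(col) for col in zip(*rows)]
    dists = [sum((v - c) ** 2 for v, c in zip(row, centers)) ** 0.5 for row in rows]
    lo, hi = min(dists), max(dists)
    span = (hi - lo) or 1.0  # avoid division by zero when all distances are equal
    return [(d - lo) / span for d in dists]


def training_weights(prior_scores):
    """Assumed mapping: a high prior anomaly score gives a low training weight."""
    return [1.0 - s for s in prior_scores]


def weighted_loss(unlabeled_scores, weights, anomaly_scores, margin=5.0):
    """Hypothetical margin-style loss, NOT the paper's concentration loss."""
    # Unlabeled instances are pulled toward score 0, scaled by their weight.
    normal_term = sum(w * abs(s) for w, s in zip(weights, unlabeled_scores))
    # Labeled anomalies are penalized until their score reaches `margin`.
    anomaly_term = sum(max(0.0, margin - s) for s in anomaly_scores)
    return (normal_term + anomaly_term) / (len(unlabeled_scores) + len(anomaly_scores))


# Toy unlabeled set: the last row sits far from the rest (likely contamination).
unlabeled = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [4.0, 4.0]]
priors = prior_anomaly_scores(unlabeled)
weights = training_weights(priors)
```

In this toy example the distant row receives the smallest weight, so it contributes least to the unlabeled part of the loss. That is the intuition behind weighting unlabeled data by a prior score so that contamination does not dominate training.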

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
