Dual Autoencoders Generative Adversarial Network for Imbalanced Classification Problem

Ensen Wu; Hongyan Cui; Roy E. Welsch

doi:10.1109/ACCESS.2020.2994327

IEEE Access (Jan 2020)

Affiliations

Ensen Wu: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
Hongyan Cui: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
Roy E. Welsch: Sloan School of Business, Massachusetts Institute of Technology, Cambridge, MA, USA

DOI: https://doi.org/10.1109/ACCESS.2020.2994327
Journal volume & issue: Vol. 8
pp. 91265 – 91275

Abstract

The imbalanced classification problem has become a major issue in many fields, especially in fraud detection. In fraud detection, the transaction datasets available for training are extremely imbalanced, with fraudulent transaction logs heavily underrepresented. Moreover, because fraud samples are so few, feature learning methods cannot directly mine the features of fraud samples that would best separate the classes. Both problems significantly reduce the effectiveness of fraud detection models. In this paper, we propose a Dual Autoencoders Generative Adversarial Network, which balances the majority and minority classes and learns feature representations of normal and fraudulent transactions to improve the accuracy of fraud detection. The model first trains a generative adversarial network to output enough synthetic fraudulent transactions for autoencoder training. Then, two autoencoders are trained, one on the normal dataset and one on the fraud dataset. Each sample is encoded by both autoencoders to obtain two sets of features, which are combined to form the dual autoencoding features. Finally, the model detects fraudulent transactions with a neural network trained on the augmented training set. Experimental results show that the model outperforms a set of well-known classification methods, with sensitivity and precision in particular improved.
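
The feature-combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Encoder` class, the `dual_features` helper, the layer sizes, and the randomly initialized weights are all assumptions made for illustration, and the GAN augmentation and autoencoder training are omitted. The sketch only shows how one transaction is passed through two class-specific encoders and how the two codes are joined into a single feature vector for a downstream classifier.

```python
import math
import random


class Encoder:
    """Single-layer tanh encoder, standing in for the encoder half of a
    trained autoencoder (hypothetical; the paper's architecture may differ)."""

    def __init__(self, n_inputs, n_codes, seed):
        rng = random.Random(seed)
        # Assumption: random weights replace weights that a real pipeline
        # would learn by minimizing reconstruction error on one class.
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                        for _ in range(n_codes)]
        self.biases = [0.0] * n_codes

    def encode(self, x):
        # One code value per weight row: tanh(w . x + b)
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.weights, self.biases)]


def dual_features(x, normal_encoder, fraud_encoder):
    """Concatenate the codes produced by the two class-specific encoders."""
    return normal_encoder.encode(x) + fraud_encoder.encode(x)


# One encoder per class; code sizes here are arbitrary illustration values.
normal_encoder = Encoder(n_inputs=4, n_codes=3, seed=0)
fraud_encoder = Encoder(n_inputs=4, n_codes=2, seed=1)

transaction = [0.2, -1.3, 0.7, 0.05]  # made-up, already-scaled feature vector
features = dual_features(transaction, normal_encoder, fraud_encoder)
print(len(features))  # 5 = 3 codes from the normal encoder + 2 from the fraud encoder
```

In a full pipeline, the concatenated vector would replace or accompany the raw transaction features as input to the final neural network classifier.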

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
