IEEE Access (Jan 2020)

Binary Imbalanced Data Classification Based on Modified D2GAN Oversampling and Classifier Fusion

  • Junhai Zhai,
  • Jiaxing Qi,
  • Sufang Zhang

DOI
https://doi.org/10.1109/ACCESS.2020.3023949
Journal volume & issue
Vol. 8
pp. 169456 – 169469

Abstract

The binary imbalance problem refers to a classification scenario in which one class contains a large number of samples while the other class contains only a few. When traditional classifiers are faced with imbalanced data sets, they are usually biased towards the majority class, resulting in poor classification performance. Oversampling is an effective way to address this problem, yet how to oversample so that the generated samples are diverse remains a challenge. In this article, we propose a diversity oversampling method based on a modified D2GAN model and, building on this oversampling, a binary imbalanced data classification approach based on classifier fusion by fuzzy integral. Extensive experiments are conducted on 8 data sets to compare the proposed methods with 7 state-of-the-art methods on 5 measures: MMD-score, Silhouette-score, F-measure, G-means, and AUC-area. The 7 methods comprise 4 SMOTE-related approaches and 3 GAN-related approaches. The experimental results demonstrate that the proposed methods are more effective and efficient than the compared approaches.
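
To make two of the evaluation measures named above concrete, the following is a minimal sketch of how F-measure and G-mean are commonly computed for a binary problem, with the minority class treated as positive. It uses the standard textbook definitions, which are an assumption here: the abstract does not give formulas, and the paper's exact definitions may differ. The function names and toy labels are hypothetical and serve only as illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def f_measure(y_true, y_pred, positive=1):
    """F1 score: harmonic mean of precision and recall (assumed definition)."""
    tp, fp, _tn, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data: 2 minority samples (label 1) and 8 majority samples (label 0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts the majority class is 80% accurate,
# yet both measures drop to zero because no minority sample is found.
always_majority = [0] * 10

# A classifier that finds one of the two minority samples,
# at the cost of one false alarm on the majority class.
finds_one = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

f_majority = f_measure(y_true, always_majority)
g_majority = g_mean(y_true, always_majority)
f_one = f_measure(y_true, finds_one)
g_one = g_mean(y_true, finds_one)
```

The toy data illustrates why such measures are preferred over plain accuracy on imbalanced data: a classifier that ignores the minority class can still look accurate, while F-measure and G-mean both penalize it.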

Keywords