MATEC Web of Conferences (Jan 2016)

An Imbalanced Data Classification Algorithm of De-noising Auto-Encoder Neural Network Based on SMOTE

  • Zhang Chenggang
  • Song Jiazhi
  • Pei Zhili
  • Jiang Jingqing

DOI
https://doi.org/10.1051/matecconf/20165601014
Journal volume & issue
Vol. 56
p. 01014

Abstract

Imbalanced data classification has long been an active research topic in machine learning. The synthetic minority over-sampling technique (SMOTE) is a classical approach to balancing datasets, but it may introduce problems such as noise. The Stacked De-noising Auto-Encoder neural network (SDAE) can effectively reduce data redundancy and noise through unsupervised greedy layer-wise learning. To address the shortcomings of SMOTE when synthesizing new minority class samples, the paper proposes SMOTE-SDAE, a Stacked De-noising Auto-Encoder neural network algorithm based on SMOTE, for imbalanced data classification. The proposed algorithm not only synthesizes new minority class samples but also de-noises and classifies the sampled data. Experimental results show that, compared with traditional algorithms, SMOTE-SDAE significantly improves minority class classification accuracy on imbalanced datasets.
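
For readers unfamiliar with the over-sampling step, the following is a minimal sketch of SMOTE-style interpolation, not the authors' implementation. The function name `smote_sketch`, its parameters, and the choice of Euclidean nearest neighbours are assumptions made for illustration. It also shows why synthetic samples can be noisy: each new point is placed on the line segment between a minority sample and one of its minority neighbours, regardless of where majority samples lie.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Synthesize n_new minority samples by interpolating between neighbours.

    Hypothetical helper for illustration; assumes at least two minority samples.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest minority neighbours of each sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbours for each new point.
    base = rng.integers(0, n, size=n_new)
    nbr = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position along the segment base -> neighbour.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[nbr] - X_min[base])
```

In a pipeline such as the one the abstract describes, the synthesized samples would be appended to the training set before it is passed to the de-noising auto-encoder. How the paper configures that step is not stated in the abstract.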