IEEE Access (Jan 2022)

Imbalanced Classification via Feature Dictionary-Based Minority Oversampling

  • Minho Park,
  • Hwa Jeon Song,
  • Dong-Oh Kang

DOI
https://doi.org/10.1109/ACCESS.2022.3161510
Journal volume & issue
Vol. 10
pp. 34236 – 34245

Abstract

Image classification is a continuously studied area of computer vision, and related work remains active. However, prediction performance on real-world datasets is limited by imbalance in the number of samples between classes. One way to overcome this limitation is data augmentation through artificial sample generation for minority classes. As an oversampling method of this kind, we propose a feature dictionary-based generative model. Feature dictionaries are built with a pretrained feature extractor, and the proposed generative model synthesizes artificial samples based on the dictionary. Class-balanced training can then be conducted by fine-tuning the classifier with the synthesized samples as additional data for the minority classes. We apply the proposed framework to fashion datasets, which have extreme class imbalance. The experimental results show that the proposed model achieves the highest top-1 performance on various public fashion datasets. In addition, we analyze the number of samples in the dictionary and test the effectiveness of the components of the proposed model through ablation studies.
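The pipeline outlined in the abstract (build a per-class feature dictionary, synthesize minority samples from it, then train on the balanced set) can be illustrated with a minimal sketch. This is not the authors' implementation: the features are assumed to be already extracted, the function names are hypothetical, and the synthesis step uses SMOTE-style interpolation between dictionary entries as a simple stand-in for the proposed generative model.

```python
import random
from collections import defaultdict


def build_feature_dictionary(features, labels):
    """Group feature vectors by class label (hypothetical helper)."""
    dictionary = defaultdict(list)
    for vec, label in zip(features, labels):
        dictionary[label].append(list(vec))
    return dict(dictionary)


def synthesize(entries, n_new, rng):
    """Stand-in generator (an assumption, not the paper's model):
    each new sample is a convex combination of two dictionary entries."""
    samples = []
    for _ in range(n_new):
        a = rng.choice(entries)
        b = rng.choice(entries)
        t = rng.random()  # interpolation weight in [0, 1)
        samples.append([x + t * (y - x) for x, y in zip(a, b)])
    return samples


def oversample_to_balance(features, labels, seed=0):
    """Add synthetic samples until every class matches the largest class."""
    rng = random.Random(seed)
    dictionary = build_feature_dictionary(features, labels)
    target = max(len(entries) for entries in dictionary.values())
    out_features = [list(vec) for vec in features]
    out_labels = list(labels)
    for label, entries in dictionary.items():
        extra = synthesize(entries, target - len(entries), rng)
        out_features.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_features, out_labels


# Toy features: four majority samples and two minority samples.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0], [6.0, 6.0]]
labels = ["majority"] * 4 + ["minority"] * 2
balanced_x, balanced_y = oversample_to_balance(features, labels)
```

In this sketch the balanced feature set would then be used to fine-tune the classifier; a learned generator conditioned on the dictionary would replace `synthesize` in a setup closer to the one the abstract describes.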

Keywords