Sensors (Oct 2021)

Evolutionary Mahalanobis Distance-Based Oversampling for Multi-Class Imbalanced Data Classification

  • Leehter Yao
  • Tung-Bin Lin

DOI
https://doi.org/10.3390/s21196616
Journal volume & issue
Vol. 21, no. 19
p. 6616

Abstract

Sensing data are often imbalanced across classes, and oversampling the minority class is an effective remedy. This paper proposes an oversampling method, evolutionary Mahalanobis distance oversampling (EMDO), for multi-class imbalanced data classification. EMDO uses a set of ellipsoids to approximate the decision regions of the minority class. Multi-objective particle swarm optimization (MOPSO) is integrated with the Gustafson–Kessel algorithm in EMDO to learn the size, center, and orientation of every ellipsoid. Synthetic minority samples are generated within every ellipsoid based on Mahalanobis distance, and the number generated in each ellipsoid is determined by the density of minority samples in that ellipsoid. Computer simulations indicate that EMDO outperforms most of the widely used oversampling schemes.
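
To illustrate the idea of generating synthetic samples inside an ellipsoid defined by Mahalanobis distance, the following is a minimal sketch and not the EMDO procedure itself. It assumes a single ellipsoid whose center and shape are taken from the sample mean and covariance of the minority class, whereas EMDO learns several ellipsoids with MOPSO and the Gustafson–Kessel algorithm. The function names `sample_in_ellipsoid` and `mahalanobis`, the `radius` parameter, and the toy data are hypothetical choices made for this example.

```python
import numpy as np


def sample_in_ellipsoid(center, cov, n_samples, radius=1.0, rng=None):
    """Draw points whose Mahalanobis distance to `center` is at most `radius`.

    Illustrative only: points are uniform inside the ellipsoid
    (x - center)^T cov^{-1} (x - center) <= radius^2.
    """
    rng = np.random.default_rng(rng)
    center = np.asarray(center, dtype=float)
    d = center.shape[0]
    # Uniformly distributed directions on the unit sphere.
    directions = rng.standard_normal((n_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Radii scaled so that points are uniform inside the ball of given radius.
    radii = radius * rng.random(n_samples) ** (1.0 / d)
    ball = directions * radii[:, None]
    # Map the ball onto the ellipsoid with the Cholesky factor of cov.
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    return center + ball @ chol.T


def mahalanobis(points, center, cov):
    """Mahalanobis distance of each row of `points` to `center` under `cov`."""
    diff = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    inv = np.linalg.inv(cov)
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv, diff))


if __name__ == "__main__":
    # Toy minority class (assumed data, not from the paper).
    rng = np.random.default_rng(0)
    minority = rng.multivariate_normal([0.0, 0.0], [[2.0, 0.6], [0.6, 1.0]], size=30)
    # Assumption: one ellipsoid from the sample mean and covariance.
    mu = minority.mean(axis=0)
    sigma = np.cov(minority, rowvar=False)
    synthetic = sample_in_ellipsoid(mu, sigma, n_samples=50, radius=2.0, rng=1)
    print(synthetic.shape, mahalanobis(synthetic, mu, sigma).max())
```

In this sketch the number of synthetic samples is a fixed argument. A density-based allocation across several ellipsoids, as the abstract describes, would set `n_samples` per ellipsoid, and the exact rule used by EMDO is not given here.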

Keywords