Engineering, Technology & Applied Science Research (Jun 2024)

Multi-Class Imbalanced Data Classification: A Systematic Mapping Study

  • Yujiang Wang,
  • Marshima Mohd Rosli,
  • Norzilah Musa,
  • Feng Li

DOI
https://doi.org/10.48084/etasr.7206
Journal volume & issue
Vol. 14, no. 3

Abstract

Multi-class data classification is a significant and challenging research topic in contemporary machine learning, particularly when the datasets are imbalanced. A thorough investigation of multi-class imbalanced data classification is therefore increasingly pertinent. This paper presents an overview of multi-class imbalanced data classification obtained through a systematic mapping study, which analyzes the current state of the field with the primary goal of characterizing the body of machine learning research on the topic. To this end, 7,164 papers were assessed, and 147 prominent ones were selected from five digital libraries and categorized according to techniques, issues, and types of datasets. Based on a thorough review of these papers, a taxonomy of multi-class imbalanced data classification techniques is proposed. The results show that researchers widely employ algorithmic-level, ensemble, and oversampling strategies to address multi-class imbalance in medical datasets, primarily to mitigate the impact of challenging data factors. This research highlights an urgent need for more studies on multi-class imbalanced data classification.

Keywords