IEEE Access (Jan 2025)

An Improved Triangulation Oversampling Method for Processing Unbalanced Data

  • Jingjing Liu,
  • Yefeng Liu,
  • Yanwei Ma,
  • Qichun Zhang

DOI
https://doi.org/10.1109/ACCESS.2025.3538867
Journal volume & issue
Vol. 13
pp. 33655 – 33664

Abstract

Classification algorithms depend heavily on large volumes of data, yet class imbalance hinders a model’s ability to learn adequately from scarce minority-class samples, which degrades its overall learning capacity. This paper addresses data imbalance in data-driven classification problems by introducing an Improved Finite Elements-Synthetic Minority Oversampling Technique (IFE-SMOTE) for data balancing. Building on FE-SMOTE, the method triangulates the sample space, designates three strategic points within each split element according to defined rules, and generates synthetic samples within the linear vicinity of these points. The mathematical expectation of the generated samples aligns with the triangle center of the original minority samples, while their variance closely mirrors that of the triangle center, ensuring statistical consistency with the original data. The corresponding theorem is stated and proved. Numerical experiments confirm the effectiveness of the proposed method: IFE-SMOTE was applied to eight classification datasets and compared with other oversampling algorithms using G-mean, F-measure, and AUC. IFE-SMOTE’s average scores were 0.9273, 0.9754, and 0.9309, respectively, outperforming FE-SMOTE by 0.0268, 0.0059, and 0.0253, and the other algorithms’ averages by 0.0493, 0.0169, and 0.0402.
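
The abstract does not give the rules IFE-SMOTE uses to choose its three points inside each element, so the sketch below does not reproduce the paper's method. As a stand-in, it draws synthetic points uniformly inside one triangle of minority samples and checks empirically that their mean lies close to the triangle's centroid, which is the kind of expectation property the abstract describes. The vertices `A`, `B`, `C` and all function names are hypothetical, chosen only for illustration.

```python
import random

def sample_in_triangle(a, b, c, rng):
    """Draw one point uniformly from the triangle with vertices a, b, c."""
    # The gaps between two sorted uniform draws give barycentric weights
    # that are uniform over the triangle (each weight has mean 1/3).
    u, v = sorted((rng.random(), rng.random()))
    w = (u, v - u, 1.0 - v)
    return tuple(w[0] * ai + w[1] * bi + w[2] * ci
                 for ai, bi, ci in zip(a, b, c))

def centroid(a, b, c):
    """Return the centroid (mean of the three vertices)."""
    return tuple((ai + bi + ci) / 3.0 for ai, bi, ci in zip(a, b, c))

def oversample_triangle(a, b, c, n, seed=0):
    """Generate n synthetic points inside one triangular element."""
    rng = random.Random(seed)
    return [sample_in_triangle(a, b, c, rng) for _ in range(n)]

# Hypothetical minority samples forming a single triangular element.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
synthetic = oversample_triangle(A, B, C, n=20000)

# Empirical mean of the synthetic points, to compare with the centroid.
mean = tuple(sum(p[i] for p in synthetic) / len(synthetic) for i in range(2))
center = centroid(A, B, C)
print("empirical mean:", mean)
print("centroid:      ", center)
```

In a full pipeline the triangles would come from a triangulation of the minority-class samples, and the uniform draw would be replaced by the paper's point-selection rules.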

Keywords