Journal of Applied Informatics and Computing (Nov 2024)
Comparison of Oversampling Techniques on Minority Data Using Imbalance Software Defect Prediction Dataset
Abstract
The dataset is a vital component of any software defect prediction model. However, the NASA software defect prediction datasets are imbalanced, with the minority class underrepresented. This study compares the performance of oversampling techniques in addressing this imbalance. A total of 90 oversampling techniques, comprising SMOTE and its variants, were evaluated. The results indicate that no oversampling technique is able to overcome the imbalance problem. The original dataset without oversampling performs well on accuracy and F1-score but poorly on AUC score and g-score. Several oversampling techniques improve AUC score and g-score, but at the same time reduce accuracy and F1-score.
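The gap between accuracy and g-score described above can be illustrated with a small sketch. It assumes that g-score means the geometric mean of sensitivity and specificity, which the abstract does not define. The confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (accuracy, g_mean) for a binary confusion matrix.

    Assumption: g_mean = sqrt(sensitivity * specificity); the paper's
    exact definition of g-score is not given in the abstract.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # recall on the minority (positive) class
    specificity = tn / (tn + fp)  # recall on the majority (negative) class
    g_mean = math.sqrt(sensitivity * specificity)
    return accuracy, g_mean


# Hypothetical imbalanced test set: 10 minority and 90 majority samples.
# The classifier finds only 2 of the 10 minority samples.
accuracy, g_mean = confusion_metrics(tp=2, fn=8, tn=89, fp=1)
print(f"accuracy={accuracy:.2f} g_mean={g_mean:.2f}")
```

With these illustrative counts, accuracy is 0.91 while the geometric mean stays below 0.5, because accuracy is dominated by the majority class and the geometric mean penalizes the missed minority samples.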
Keywords