IEEE Access (Jan 2021)

A Novel Imbalanced Ensemble Learning in Software Defect Predication

  • Jianming Zheng,
  • Xingqi Wang,
  • Dan Wei,
  • Bin Chen,
  • Yanli Shao

DOI
https://doi.org/10.1109/ACCESS.2021.3072682
Journal volume & issue
Vol. 9
pp. 86855 – 86868

Abstract

With the availability of high-speed Internet and the advent of Internet of Things devices, modern software systems are growing in both size and complexity. Software defect prediction (SDP) helps assure the quality of such complex systems. However, the imbalanced class distribution of defect data sets biases most software defect prediction methods and reduces their accuracy. This paper presents two novel approaches for learning from imbalanced data sets that achieve higher predictive accuracy on the minority class. The two methods differ in whether oversampling and misclassification cost information are used during the training stage, and they suit different degrees of imbalance: one handles highly imbalanced data sets and the other moderately imbalanced data sets. Compared with other state-of-the-art imbalance learning algorithms on imbalanced data sets, the experimental results show that the two methods achieve strong results in terms of the G-mean and AUC measures and identify defective modules more accurately, reducing the cost of the detection system.
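
The abstract reports results in terms of G-mean but does not define it. As an illustration only, the sketch below assumes the usual definition, the geometric mean of the true positive rate and the true negative rate, and shows why it is preferred over plain accuracy on imbalanced defect data. The function name `g_mean` and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of recall (TPR) and specificity (TNR).

    Assumed definition: sqrt(TPR * TNR). Labels equal to `positive`
    are treated as the defective (minority) class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against empty classes to avoid division by zero.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)


# Hypothetical toy data: 2 defective modules out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that labels every module as clean is 80% accurate
# but finds no defects, so its G-mean is zero.
always_clean = [0] * 10

# A classifier that finds one of the two defects at the price of
# two false alarms scores lower on accuracy but higher on G-mean.
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print(g_mean(y_true, always_clean))
print(g_mean(y_true, mixed))
```

Because the two rates are multiplied, a model that ignores the minority class cannot score well however accurate it is on the majority class, which is the behaviour an imbalance-aware measure needs.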

Keywords