e-Informatica Software Engineering Journal (Aug 2021)

Multi-view learning for software defect prediction

  • Elife Ozturk Kiyak,
  • Derya Birant,
  • Kokten Ulas Birant

DOI
https://doi.org/10.37190/e-Inf210108
Journal volume & issue
Vol. 15, no. 1

Abstract

Read online

Background: Traditionally, machine learning algorithms have been simply applied for software defect prediction by considering single-view data, meaning the input data contains a single feature vector. Nevertheless, different software engineering data sources may include multiple and partially independent information, which makes the standard single-view approaches ineffective. Objective: In order to overcome the single-view limitation in the current studies, this article proposes the usage of a multi-view learning method for software defect classification problems. Method: The Multi-View k-Nearest Neighbors (MVKNN) method was used in the software engineering field. In this method, first, base classifiers are constructed to learn from each view, and then classifiers are combined to create a robust multi-view model. Results: In the experimental studies, our algorithm (MVKNN) is compared with the standard k-nearest neighbors (KNN) algorithm on 50 datasets obtained from different software bug repositories. The experimental results demonstrate that the MVKNN method outperformed KNN on most of the datasets in terms of accuracy. The average accuracy values of MVKNN are 86.59%, 88.09%, and 83.10% for the NASA MDP, Softlab, and OSSP datasets, respectively. Conclusion: The results show that using multiple views (MVKNN) can usually improve classification accuracy compared to a single-view strategy (KNN) for software defect prediction.

Keywords