Jisuanji kexue yu tansuo (Dec 2023)

Impact of Hyperparameter Optimization on Cross-Version Defect Prediction: An Empirical Study

  • HAN Hui, YU Qiao, ZHU Yi

DOI
https://doi.org/10.3778/j.issn.1673-9418.2209087
Journal volume & issue
Vol. 17, no. 12
pp. 3052 – 3064

Abstract

In machine learning, hyperparameters are among the key factors that affect prediction performance. Previous studies have shown that optimizing hyperparameters can improve the performance of within-version defect prediction and cross-project defect prediction, but its impact on cross-version defect prediction remains unclear. This paper selects five classical defect prediction models (decision tree, K-nearest neighbors, random forest, support vector machine, and multi-layer perceptron) and four common hyperparameter optimization algorithms (Bayesian optimization based on TPE, Bayesian optimization based on SMAC, random search, and simulated annealing), and conducts an empirical study on the PROMISE dataset to explore the influence of hyperparameter optimization on the performance of cross-version defect prediction. The results indicate that: firstly, the AUC of cross-version defect prediction improves markedly after optimizing the decision tree, K-nearest neighbors, and multi-layer perceptron models; secondly, the optimized models remain as stable as the models with default hyperparameters; thirdly, hyperparameter optimization takes only 1 to 2 minutes on average for all models except the more complex multi-layer perceptron, so it is feasible to optimize model hyperparameters in cross-version defect prediction experiments. These results indicate that hyperparameter optimization should be considered in the cross-version defect prediction process to improve its performance.
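As a rough illustration of the experimental setting described above (not the authors' actual pipeline), the sketch below tunes a decision tree with random search, one of the four optimizers studied, on an older project version and evaluates AUC on the next version. The file names, the "bug" label column, and the search space are assumptions made only for illustration.

# Minimal cross-version defect prediction sketch: tune on version N, test on version N+1.
# "ant-1.5.csv", "ant-1.6.csv", and the "bug" column are hypothetical PROMISE-style inputs.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import roc_auc_score

# Train on the older version, test on the newer one (cross-version setting).
train = pd.read_csv("ant-1.5.csv")   # assumed older-version metrics file
test = pd.read_csv("ant-1.6.csv")    # assumed newer-version metrics file
X_train, y_train = train.drop(columns=["bug"]), (train["bug"] > 0).astype(int)
X_test, y_test = test.drop(columns=["bug"]), (test["bug"] > 0).astype(int)

# Candidate hyperparameters for the decision tree; random search samples this space
# and keeps the configuration with the best cross-validated AUC on the training version.
param_dist = {
    "max_depth": [3, 5, 10, 20, None],
    "min_samples_split": [2, 5, 10, 20],
    "min_samples_leaf": [1, 2, 5, 10],
    "criterion": ["gini", "entropy"],
}
search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=50,
    scoring="roc_auc",
    cv=5,
    random_state=0,
)
search.fit(X_train, y_train)

# Compare default vs. optimized hyperparameters on the newer version by AUC.
default_model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
default_auc = roc_auc_score(y_test, default_model.predict_proba(X_test)[:, 1])
tuned_auc = roc_auc_score(y_test, search.best_estimator_.predict_proba(X_test)[:, 1])
print(f"default AUC={default_auc:.3f}  tuned AUC={tuned_auc:.3f}  best params={search.best_params_}")

The same skeleton extends to the other models and optimizers in the study by swapping the estimator and replacing random search with a TPE/SMAC-based or simulated-annealing search.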

Keywords