e-Informatica Software Engineering Journal (Feb 2023)
The Effect of Dual Hyperparameter Optimization on Software Vulnerability Prediction Models
Abstract
Background: Prediction of software vulnerabilities is a major concern in the field of software security. Many researchers have worked to construct various software vulnerability prediction (SVP) models. The emerging machine learning domain aids in building effective SVP models. The employment of data balancing/resampling techniques and optimal hyperparameters can upgrade their performance. Previous research studies have shown the impact of hyperparameter optimization (HPO) on machine learning algorithms and data balancing techniques. Aim: The current study aims to analyze the impact of dual hyperparameter optimization on metrics-based SVP models. Method: This paper has proposed the methodology using the python framework Optuna that optimizes the hyperparameters for both machine learners and data balancing techniques. For the experimentation purpose, we have compared six combinations of five machine learners and five resampling techniques considering default parameters and optimized hyperparameters. Results: Additionally, the Wilcoxon signed-rank test with the Bonferroni correction method was implied, and observed that dual HPO performs better than HPO on learners and HPO on data balancers. Furthermore, the paper has assessed the impact of data complexity measures and concludes that HPO does not improve the performance of those datasets that exhibit high overlap. Conclusion: The experimental analysis unveils that dual HPO is 64% effective in enhancing the productivity of SVP models.
Keywords