IEEE Access (Jan 2024)

The Effectiveness of Hidden Dependence Metrics in Bug Prediction

  • Judit Jasz

DOI
https://doi.org/10.1109/ACCESS.2024.3406929
Journal volume & issue
Vol. 12
pp. 77214 – 77225

Abstract

Finding and fixing bugs is among the most difficult and most important tasks in software maintenance. Consequently, a great deal of work has been done on this topic in recent decades, most of it based on machine learning methods. Bug prediction studies exist for almost all programming languages. Existing solutions generally predict bugs from information that can be easily extracted from the source code, rather than relying on more expensive analyses that require a deeper understanding of the program. They also usually predict faults at a coarse level (module, file, or class), which is useful but still leaves locating the bug itself a difficult task. This work presents a solution that predicts bugs at the method level while also tracking the dependencies in the program using an efficient algorithm, resulting in more accurate predictions. The measurements show that the approach outperforms predictions based on traditional metrics in most cases and that, with proper filtering, RandomForest, the best-performing algorithm according to the F-measure, achieves an improvement of up to 11%. Finally, the results show that the introduced metrics are also suitable for predicting bugs that will appear later in a given project, provided sufficient training data is available.

Keywords