Applied Sciences (Mar 2023)

Predicting Software Cohesion Metrics with Machine Learning Techniques

  • Elif Nur Haner Kırğıl,
  • Tülin Erçelebi Ayyıldız

DOI
https://doi.org/10.3390/app13063722
Journal volume & issue
Vol. 13, no. 6
p. 3722

Abstract

Cohesion is one of the important factors used to evaluate software maintainability. However, measuring cohesion by tracing the source code manually is relatively difficult. Although many static code analysis tools exist, not every tool measures every metric, so the user must apply different tools for different metrics. In this study, as an alternative to relying on these tools, we predicted cohesion values (LCOM2, TCC, LCC, and LSCC) with machine learning techniques (KNN, REPTree, multi-layer perceptron, linear regression (LR), support vector machine, and random forest (RF)). We created two datasets from two different open-source software projects. According to the results, the KNN algorithm performed best for the LCOM2 and TCC metrics, and the REPTree algorithm performed best for the LCC and LSCC metrics. However, across all metrics, RF, REPTree, and KNN performed similarly, so any of these three techniques can be used for software cohesion metric prediction.
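
To make one of the predicted quantities concrete, here is a minimal sketch of tight class cohesion (TCC) as it is commonly defined: the fraction of method pairs in a class that share at least one attribute. The abstract does not state which tool or exact variant the authors used, so this is an assumption. The function name `tcc`, the toy class, and its method-to-attribute mapping are hypothetical, and the sketch ignores method visibility and attribute use through called methods.

```python
from itertools import combinations


def tcc(method_attrs):
    """Return the share of method pairs that use at least one common attribute.

    method_attrs maps each method name to the set of attributes it accesses.
    Returns None when the class has fewer than two methods (no pairs exist).
    """
    methods = list(method_attrs)
    if len(methods) < 2:
        return None
    pairs = list(combinations(methods, 2))
    # A pair counts as connected when the two attribute sets intersect.
    connected = sum(1 for a, b in pairs if method_attrs[a] & method_attrs[b])
    return connected / len(pairs)


# Hypothetical toy class: four methods and the attributes each one touches.
toy_class = {
    "deposit": {"balance"},
    "withdraw": {"balance", "limit"},
    "set_limit": {"limit"},
    "owner_name": {"owner"},
}

tcc_value = tcc(toy_class)
print(f"TCC = {tcc_value:.3f}")
```

In this toy class, two of the six method pairs share an attribute, so the value is low. A measurement like this requires knowing every attribute access in every method, which is the tracing effort the study aims to avoid by predicting the metric instead.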

Keywords