Frontiers in Computer Science (Dec 2024)

CodeGuard: enhancing accuracy in detecting clones within java source code

  • Yasir Glani,
  • Luo Ping

DOI
https://doi.org/10.3389/fcomp.2024.1455860
Journal volume & issue
Vol. 6

Abstract

Read online

Detecting code clones remains challenging, particularly for Type-II clones, with modified identifiers, and Type-III ST and MT clones, where up to 30% and 50% of code, respectively, are added or removed from the original clone code. To address this, we introduce CodeGuard, an innovative technique that employs comprehensive level-by-level abstraction for Type-II clones and a flexible signature matching algorithm for Type-III clone categories. This method requires at least 50% similarity within two corresponding chunks within the same file, ensuring accurate clone identification. Unlike recently proposed methods limited to clone detection, CodeGuard precisely pinpoints changes within clone files, facilitating effective debugging and thorough code analysis. It is validated through comprehensive evaluations using reputable datasets, CodeGuard demonstrates superior precision, high recall, robust F1 scores, and outstanding accuracy. This innovative methodology not only sets new performance standards in clone detection but also emphasizes the role CodeGuard's can play in modern software development, paving the way for advancements in code quality and maintenance.

Keywords