Proceedings of the XXth Conference of Open Innovations Association FRUCT (May 2023)

Current Trends in the Search for Similarities in Source Codes with an Application in the Field of Plagiarism and Clone Detection

  • Patrik Hrkut,
  • Michal Ďuračík,
  • Štefan Toth,
  • Matej Meško

DOI
https://doi.org/10.23919/FRUCT58615.2023.10143064
Journal volume & issue
Vol. 33, no. 1
pp. 77 – 84

Abstract

Many methods and approaches exist for determining the similarity between two pieces of source code. Many of them were inspired by developments in natural language processing (NLP), since source code can be regarded as a special type of text. These methods have been implemented in numerous software tools, and, surprisingly, many of those tools (for example MOSS and JPlag) are still in use. Artificial intelligence has brought new techniques to NLP, and these have also been applied to source code analysis. This article provides an overview of methods for detecting similarity in source code, broader in scope than any overview we have found in the literature so far. Although it is certainly not exhaustive, it covers approaches ranging from the oldest to those that are only now beginning to gain attention.
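
To make the idea of treating source code as a special type of text concrete, the following is a minimal sketch of one simple text-style comparison: token k-gram overlap measured with the Jaccard index. It is an illustration only, not a method taken from the article and not the algorithm used by MOSS or JPlag. The function names, the keyword subset, and the choice of k = 3 are all assumptions made for this example.

```python
import re

# Illustrative subset of keywords only (an assumption for this sketch);
# a real tool would rely on the language's full grammar.
KEYWORDS = {"def", "return", "for", "in", "if", "else", "while"}


def tokenize(source):
    """Split source text into identifiers, integer literals, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def normalize(tokens):
    """Replace identifiers and numbers with placeholders so renaming has no effect."""
    out = []
    for tok in tokens:
        if re.fullmatch(r"[A-Za-z_]\w*", tok) and tok not in KEYWORDS:
            out.append("ID")
        elif tok.isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return out


def shingles(tokens, k=3):
    """Return the set of all k consecutive tokens (k-grams)."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard_similarity(code_a, code_b, k=3):
    """Jaccard index of the two k-gram sets, a value between 0.0 and 1.0."""
    set_a = shingles(normalize(tokenize(code_a)), k)
    set_b = shingles(normalize(tokenize(code_b)), k)
    if not set_a and not set_b:
        return 1.0  # two inputs too short to yield k-grams are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


original = "def add(x, y):\n    return x + y"
renamed = "def sum_two(first, second):\n    return first + second"
unrelated = "while n: n = n - 1"

score_renamed = jaccard_similarity(original, renamed)
score_unrelated = jaccard_similarity(original, unrelated)
```

Because identifiers are replaced by a placeholder before comparison, the renamed copy produces the same k-gram set as the original, while the unrelated snippet shares little or nothing with it. A purely lexical measure like this ignores program structure and semantics, so it should be read as a toy baseline rather than a usable detector.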

Keywords