Applied Sciences (Aug 2023)

SJBCD: A Java Code Clone Detection Method Based on Bytecode Using Siamese Neural Network

  • Bangrui Wan,
  • Shuang Dong,
  • Jianjun Zhou,
  • Ying Qian

DOI
https://doi.org/10.3390/app13179580
Journal volume & issue
Vol. 13, no. 17
p. 9580

Abstract

Code clone detection is an important research topic in software engineering. Effectively discovering code clones within and between software systems matters both for software development and for resolving software infringement disputes. In practical engineering applications, source code is often unavailable, so clone detection can usually be performed only on compiled code. In addition, existing bytecode-based methods leave room for improvement in detection performance. For these reasons, this paper proposes SJBCD, a novel code clone detection method for Java bytecode. SJBCD extracts opcode sequences from bytecode files, uses GloVe to vectorize the opcodes, and builds a GRU-based Siamese neural network that is trained in a supervised manner. The trained network is then used to detect code clones. To demonstrate the effectiveness of SJBCD, this paper conducts validation experiments on the BigCloneBench dataset and provides a comparative analysis with four other methods. The experimental results show that SJBCD is effective.
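
As a rough illustration of the first step described in the abstract (extracting opcode sequences from bytecode), the sketch below pulls opcode mnemonics out of disassembly text. The abstract does not say which tool SJBCD uses, so the input format here is an assumption: text in the style printed by `javap -c`. The names `INSTRUCTION`, `extract_opcodes`, and `SAMPLE` are hypothetical and introduced only for this example; the GloVe embedding and the Siamese GRU network are not covered.

```python
import re
from typing import List

# Assumption: instruction lines look like those printed by `javap -c`, i.e.
# "<offset>: <mnemonic> [operands] [// comment]". The mnemonic must start with
# a letter, so tableswitch/lookupswitch case lines such as "1: 28" are skipped.
INSTRUCTION = re.compile(r"^\s*\d+:\s+([a-z][a-z0-9_]*)")


def extract_opcodes(disassembly: str) -> List[str]:
    """Return the opcode mnemonics found in the disassembly text, in order."""
    opcodes = []
    for line in disassembly.splitlines():
        match = INSTRUCTION.match(line)
        if match:
            # Keep only the mnemonic; operands and comments are dropped.
            opcodes.append(match.group(1))
    return opcodes


# Hand-written sample in javap style (illustrative, not taken from the paper).
SAMPLE = """\
  public int add(int, int);
    Code:
       0: iload_1
       1: iload_2
       2: iadd
       3: ireturn
"""

print(extract_opcodes(SAMPLE))  # ['iload_1', 'iload_2', 'iadd', 'ireturn']
```

Sequences of this kind are what a word-embedding model such as GloVe would then treat as "sentences" of opcode tokens.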

Keywords