Journal of King Saud University: Computer and Information Sciences (Feb 2024)

FCNN: Simple neural networks for complex code tasks

  • Xuekai Sun,
  • Tieming Liu,
  • Chunling Liu,
  • Weiyu Dong

Journal volume & issue
Vol. 36, no. 2
p. 101970

Abstract

Program analysis using deep learning has become a focus of research, and representing code as model input remains a major challenge. Abstract syntax trees (ASTs) have proven effective, but using them directly as model input introduces long-term dependency problems. Approaches based on AST paths alleviate the vanishing gradient problem caused by large syntax trees, but they still face limitations. This paper introduces an approach to code analysis based on fully connected neural networks (FCNN). We propose a new context structure that preserves the up-down relationships between path nodes, together with an encoding method based on node embeddings that mitigates model sparsity. Despite its simplicity, FCNN handles a variety of code analysis tasks. Our work underscores that, in code analysis, improvements and innovations in code representation matter as much as adopting more advanced and complex deep learning models. Evaluation on two common code analysis tasks, code classification and code similarity detection, validates the effectiveness of the proposed approach. In code classification, FCNN achieves an F1 score of 0.45, surpassing all comparison baselines. In code similarity detection, FCNN attains an F1 score of 0.91, outperforming RtvNN and CDLH by 35.82% and 10.98%, respectively.
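The general idea of feeding AST paths, encoded through node embeddings, into a fully connected layer can be illustrated with a minimal sketch. This is not the paper's context structure or model: it uses Python's standard `ast` module as a stand-in parser, random untrained embeddings and weights, and hypothetical helper names (`root_to_leaf_paths`, `encode`, `dense`). The parent-child pairing is only one assumed way to keep the up-down order of path nodes.

```python
import ast
import random


def root_to_leaf_paths(source):
    """Collect every root-to-leaf path as a list of AST node-type names."""
    paths = []

    def walk(node, prefix):
        prefix = prefix + [type(node).__name__]
        children = list(ast.iter_child_nodes(node))
        if not children:
            paths.append(prefix)
        for child in children:
            walk(child, prefix)

    walk(ast.parse(source), [])
    return paths


def encode(paths, dim=4, seed=0):
    """Average concatenated (parent, child) embeddings over all path edges.

    Assumption: embeddings are random and untrained, for illustration only.
    """
    rng = random.Random(seed)
    table = {}

    def embed(name):
        if name not in table:
            table[name] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        return table[name]

    # Adjacent nodes on a path form ordered (parent, child) pairs.
    pairs = [(p, c) for path in paths for p, c in zip(path, path[1:])]
    total = [0.0] * (2 * dim)
    for parent, child in pairs:
        for i, value in enumerate(embed(parent) + embed(child)):
            total[i] += value
    return [value / max(len(pairs), 1) for value in total]


def dense(vector, n_out=3, seed=1):
    """One fully connected layer with random weights and no bias."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in vector] for _ in range(n_out)]
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


paths = root_to_leaf_paths("x = a + 1")
features = encode(paths)
logits = dense(features)
```

Here `features` is a fixed-length vector regardless of program size, which is what lets a plain fully connected layer consume it; in a real system the embeddings and weights would be learned from labeled code.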

Keywords