Jisuanji kexue yu tansuo (Sep 2022)

Overview of Deep Learning-Based Code Representation and Its Applications

  • ZHANG Xiangping, LIU Jianxun

DOI
https://doi.org/10.3778/j.issn.1673-9418.2110073
Journal volume & issue
Vol. 16, no. 9
pp. 2011 – 2029

Abstract

Program analysis and inference play an important role in software development, maintenance, and migration. How to efficiently obtain high-quality information from program code has become an active research topic. In recent years, many researchers have applied deep learning-based representation techniques to code analysis tasks. Deep learning models can automatically extract useful features that are implicit in source code, which reduces the dependence on manually constructed features. This paper first introduces the background and basic concepts of code representation, then summarizes recent research on deep learning-based code representation learning from the perspective of static code information analysis. It then describes the application of code representation to three tasks: code clone detection, code search, and code completion. Finally, it discusses the challenges facing deep learning-based code representation and possible research directions in this field.

Keywords