GNEM: Comprehensive Similarity Learning With Ensemble Model for Code Search

Juntong Hong; Eunjong Choi; Osamu Mizuno

doi:10.1109/ACCESS.2024.3493193

IEEE Access (Jan 2024)

Affiliations

Juntong Hong: Graduate School of Science and Technology, Kyoto Institute of Technology, Kyoto, Japan
Eunjong Choi: Faculty of Information and Human Sciences, Kyoto Institute of Technology, Kyoto, Japan
Osamu Mizuno: Faculty of Information and Human Sciences, Kyoto Institute of Technology, Kyoto, Japan

DOI: https://doi.org/10.1109/ACCESS.2024.3493193
Journal volume & issue: Vol. 12
pp. 165005 – 165016

Abstract

Code search is a research field of software engineering that aims to accurately retrieve the most relevant code for a given query. However, recent deep-learning-based code search models are limited in the scalability and comprehensiveness of their alignment learning: they suffer from the out-of-vocabulary problem, and affinity-matrix-based cross-modal attention may produce incorrect alignments. In this paper, we propose a novel code search model, the Graph Network Ensemble Model (GNEM), which addresses these challenges by learning diverse alignments and enhancing the similarity representation. GNEM incorporates two graph networks that learn global and fine-grained alignments to infer a comprehensive similarity. To evaluate GNEM, we compared it with baseline models on two widely used datasets. The results show that GNEM achieves Top@1 accuracies of 0.649 and 0.702, surpassing the baseline models by approximately 18.6% and 11.7%, respectively. We also conducted ablation experiments to show the effectiveness of each component of GNEM. Finally, we visualize the attention weights between code and query to illustrate GNEM’s behavior during code search, which provides insight into its effective code search capabilities.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
