Improving Bug Localization With Effective Contrastive Learning Representation

Zhengmao Luo; Wenyao Wang; Caichun Cen

doi:10.1109/ACCESS.2022.3228802

IEEE Access (Jan 2023)

Affiliations

Zhengmao Luo: Zhejiang College of Security Technology, Wenzhou, Zhejiang, China
Wenyao Wang: Faculty of Innovation Engineering, Macau University of Science and Technology, Macau, China
Caichun Cen: Faculty of Humanities and Arts, Macau University of Science and Technology, Macau, China

DOI: https://doi.org/10.1109/ACCESS.2022.3228802
Journal volume & issue: Vol. 11
pp. 32523 – 32533

Abstract

Automated localization of buggy files can make software maintenance more efficient for developers and thereby improve the quality of software products. State-of-the-art approaches to bug localization are based on neural networks, e.g., RNNs or CNNs, and can learn semantic features from a given bug report. However, these simple neural architectures have difficulty learning the deep contextual features of bug reports, which hurts the semantic mapping between bug reports and their corresponding buggy files. To resolve this problem, we propose CoLoc, a bug localization approach that combines pre-trained language models and contrastive learning. Specifically, CoLoc is first pre-trained on a large-scale bug report corpus in an unsupervised way to learn the deep contextual features of each token in a bug report according to its context. Afterward, CoLoc is further pre-trained with a contrastive learning objective to learn contrastive representations of both bug reports and buggy files. Contrastive learning helps CoLoc learn the semantic differences between different bug reports and buggy files. To evaluate the effectiveness of CoLoc, we choose five baseline approaches and compare their performance on a public dataset. The experimental results show that CoLoc outperforms all baseline approaches by up to 76.00% in terms of MRR, achieving new results for bug localization.
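
The abstract does not spell out CoLoc's contrastive objective or its evaluation code, so the following is only a minimal sketch of the two general ideas it names, under stated assumptions: an InfoNCE-style contrastive loss over paired bug-report and buggy-file embeddings (one common form of contrastive learning, not necessarily the one used in the paper), and the standard mean reciprocal rank (MRR) metric. The function names, the temperature value, and the toy vectors are all hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(report_vecs, file_vecs, temperature=0.1):
    """InfoNCE-style loss (an assumed objective, not taken from the paper).

    report_vecs[i] and file_vecs[i] form a positive pair; every other
    file in the batch acts as a negative for report i.
    """
    total = 0.0
    for i, report in enumerate(report_vecs):
        logits = [cosine(report, f) / temperature for f in file_vecs]
        # Numerically stable log-sum-exp over all candidate files.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the matching file.
        total += log_denominator - logits[i]
    return total / len(report_vecs)


def mean_reciprocal_rank(ranked_files, buggy_files):
    """Standard MRR: average of 1/rank of the first truly buggy file.

    ranked_files[q] is the ranked candidate list for bug report q;
    buggy_files[q] is the set of files actually fixed for that report.
    A query with no buggy file in its ranking contributes 0.
    """
    total = 0.0
    for ranking, buggy in zip(ranked_files, buggy_files):
        for rank, path in enumerate(ranking, start=1):
            if path in buggy:
                total += 1.0 / rank
                break
    return total / len(ranked_files)


# Toy 2-D embeddings (hypothetical): report i is aligned with file i.
reports = [[1.0, 0.0], [0.0, 1.0]]
aligned_files = [[1.0, 0.0], [0.0, 1.0]]
swapped_files = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = info_nce_loss(reports, aligned_files)
swapped_loss = info_nce_loss(reports, swapped_files)

# Toy rankings (hypothetical file names).
mrr = mean_reciprocal_rank(
    [["a.java", "b.java"], ["c.java", "d.java", "e.java"]],
    [{"a.java"}, {"e.java"}],
)
```

In this toy setup the loss is small when each report embedding points at its own file and large when the pairing is swapped, which is the behavior a contrastive objective is meant to reward. The MRR example averages a first-rank hit and a third-rank hit.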

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
