Enhancing Bug Report Summaries Through Knowledge-Specific and Contrastive Learning Pre-Training

Yunna Shao; Bangmeng Xiang

doi:10.1109/ACCESS.2024.3368915

IEEE Access (Jan 2024)

Affiliations

Yunna Shao: Zhejiang College of Security Technology, Wenzhou, China
Bangmeng Xiang: Zhejiang College of Security Technology, Wenzhou, China

DOI: https://doi.org/10.1109/ACCESS.2024.3368915
Journal volume & issue: Vol. 12
pp. 37653 – 37662

Abstract

Bug reports are crucial in software maintenance, and concise summaries make bug triagers markedly more efficient, ultimately contributing to higher-quality software products. Contemporary methods for automatic bug report summarization rely primarily on the learning capacity of neural networks. However, these approaches often produce suboptimal summaries because of two limitations: 1) they struggle to assimilate the domain-specific knowledge inherent in bug reports, and 2) purely supervised learning is limited in its ability to capture the full context of a bug report. To address these two problems, we propose a new approach to bug report summarization, KSCLP, which builds on large language models and two domain-specific pre-training strategies: Knowledge-Specific pre-training and Contrastive Learning pre-training. The Knowledge-Specific strategy pre-trains KSCLP on a corpus of project-specific bug reports, so that the model learns the internal knowledge of bug reports and produces bug-report-aware representations. The Contrastive Learning strategy performs sequence-level pre-training, helping KSCLP capture the semantics of a bug report at a global level. After pre-training, KSCLP is further refined through a Sequence-to-Sequence framework tailored to bug report summarization. We evaluate KSCLP against five baseline models on a publicly available dataset. The results show that KSCLP outperforms all baselines, with improvements of up to 23.73, 13.97, and 20.89 points in ROUGE-1, ROUGE-2, and ROUGE-L, respectively, setting a new benchmark for bug report summarization.
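
For readers unfamiliar with the ROUGE-1 and ROUGE-2 metrics named in the abstract, the following is a minimal sketch of n-gram overlap scoring. It is not the evaluation code used in the paper: the function names (`ngrams`, `rouge_n`), the whitespace tokenization, and the example bug report sentences are all assumptions made for illustration. Reported ROUGE "points" are commonly these scores multiplied by 100, which is also assumed here rather than stated in the abstract.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) of n-gram overlap.

    Hypothetical helper: lowercases and splits on whitespace, which is
    simpler than the tokenization used by standard ROUGE toolkits.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative strings only; not taken from the paper's dataset.
    generated = "app crashes on startup"
    human = "the app crashes on startup after update"
    for n in (1, 2):
        p, r, f = rouge_n(generated, human, n)
        print(f"ROUGE-{n}: P={p:.3f} R={r:.3f} F1={f:.3f}")
```

ROUGE-L, the third metric in the abstract, is based on the longest common subsequence rather than fixed-length n-grams and is not covered by this sketch.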

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
