IEEE Access (Jan 2024)

Enhancing Bug Report Summaries Through Knowledge-Specific and Contrastive Learning Pre-Training

  • Yunna Shao
  • Bangmeng Xiang

DOI
https://doi.org/10.1109/ACCESS.2024.3368915
Journal volume & issue
Vol. 12
pp. 37653 – 37662

Abstract

Bug reports are crucial in software maintenance, and concise summaries significantly improve the efficiency of bug triagers, ultimately contributing to higher-quality software products. Contemporary methods for automatic bug report summarization rely primarily on the learning capabilities of neural networks. However, these approaches often produce suboptimal summaries because of two limitations: 1) the difficulty of assimilating the domain-specific knowledge inherent in bug reports, and 2) the inability of purely supervised learning to capture the full context of a bug report. To address these two problems, we propose KSCLP, a new approach to bug report summarization that leverages large language models and two domain-specific pre-training strategies: Knowledge-Specific pre-training and Contrastive Learning pre-training. The Knowledge-Specific strategy pre-trains KSCLP on a corpus of project-specific bug reports, so that the model learns the internal knowledge of bug reports and produces bug-report-aware representations. The Contrastive Learning strategy performs sequence-level pre-training, helping KSCLP capture the semantic information of bug reports at a global level. After pre-training, KSCLP is fine-tuned within a Sequence-to-Sequence framework tailored to bug report summarization. KSCLP is evaluated against five baseline models on a publicly available dataset. The empirical results show that KSCLP outperforms all baselines, with improvements of up to 23.73, 13.97, and 20.89 points in ROUGE-1, ROUGE-2, and ROUGE-L, respectively, thereby setting new benchmarks in bug report summarization.
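
For readers unfamiliar with the ROUGE-1 and ROUGE-2 metrics cited above, the following is a minimal sketch of how n-gram overlap between a generated summary and a reference summary can be scored. It is not the authors' evaluation code: the function names `ngrams` and `rouge_n`, the lowercase whitespace tokenization, and the example sentences are all assumptions made for illustration, and standard ROUGE toolkits may additionally apply stemming or other preprocessing. ROUGE-L, which is based on the longest common subsequence, is not covered here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) of n-gram overlap.

    Simplification: tokens are lowercase whitespace-separated words,
    with no stemming or stop-word handling.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up example strings, not taken from the paper's dataset.
    generated = "app crashes on startup"
    reference = "the app crashes on startup after update"
    print("ROUGE-1:", rouge_n(generated, reference, n=1))
    print("ROUGE-2:", rouge_n(generated, reference, n=2))
```

Scores computed this way lie between 0 and 1; papers commonly report them multiplied by 100, which is presumably the scale of the "points" quoted in the abstract.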

Keywords