Applying Convolutional Neural Networks With Different Word Representation Techniques to Recommend Bug Fixers

Syed Farhan Alam Zaidi; Faraz Malik Awan; Minsoo Lee; Honguk Woo; Chan-Gun Lee

doi:10.1109/ACCESS.2020.3040065

IEEE Access (Jan 2020)

Affiliations

Syed Farhan Alam Zaidi: CAU Institute of Innovative Talent of Big Data, Department of Computer Science and Engineering, Chung-Ang University, Seoul, South Korea
Faraz Malik Awan: Department of Computer Science and Engineering, Chung-Ang University, Seoul, South Korea
Minsoo Lee: CAU Institute of Innovative Talent of Big Data, Department of Computer Science and Engineering, Chung-Ang University, Seoul, South Korea
Honguk Woo: Department of Software, Sungkyunkwan University, Suwon, South Korea
Chan-Gun Lee: Department of Computer Science and Engineering, Chung-Ang University, Seoul, South Korea

Journal volume & issue: Vol. 8, pp. 213729–213747

Abstract

Bug triage processes are intended to assign bug reports to appropriate developers effectively, but they typically become bottlenecks in the development process, especially for large-scale software projects. Recently, several machine learning approaches, including deep learning-based approaches, have been proposed to recommend an appropriate developer automatically by learning past assignment patterns. In this paper, we propose a deep learning-based bug triage technique using a convolutional neural network (CNN) with three different word representation techniques: Word to Vector (Word2Vec), Global Vector (GloVe), and Embeddings from Language Models (ELMo). Experiments were performed on datasets from well-known large-scale open-source projects, such as Eclipse and Mozilla, and top-k accuracy was measured as an evaluation metric. The experimental results suggest that the ELMo-based CNN approach performs best for the bug triage problem. GloVe-based CNN slightly outperforms Word2Vec-based CNN in many cases. Word2Vec-based CNN outperforms GloVe-based CNN when the number of samples per class in the dataset is high enough.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
