e-Informatica Software Engineering Journal (Aug 2017)

NRFixer: Sentiment Based Model for Predicting the Fixability of Non-Reproducible Bugs

  • Anjali Goyal,
  • Neetu Sardana

DOI
https://doi.org/10.5277/e-Inf170105
Journal volume & issue
Vol. 11, no. 1
pp. 109 – 122

Abstract

Software maintenance is an essential step in the software development life cycle. Nowadays, software companies spend approximately 45% of their total cost on maintenance activities. Large software projects maintain bug repositories to collect, organize and resolve bug reports. Sometimes it is difficult to reproduce a reported bug with the information present in the bug report, and such a bug is marked with the resolution non-reproducible (NR). When NR bugs are reconsidered, a few of them might get fixed (NR-to-fix), leaving the others with the same resolution (NR). To analyse the behaviour of developers towards NR-to-fix and NR bugs, a sentiment analysis of the textual contents of NR bug reports was conducted. The analysis shows that sentiments in NR bug reports incline more towards negativity than those in reproducible bug reports. There is also a noticeable opinion drift in the sentiments of NR-to-fix bug reports. Observations derived from this analysis were the inspiration for a model that can judge the fixability of NR bugs. Thus a framework, NRFixer, which predicts the probability of NR bug fixation, is proposed. NRFixer was evaluated along two dimensions. The first dimension considers the meta-fields of bug reports (model-1), and the other additionally incorporates the sentiments of developers (model-2) for prediction. Both models were compared using various machine learning classifiers (Zero-R, naive Bayes, J48, random tree and random forest). The bug reports of the Firefox and Eclipse projects were used to test NRFixer. J48 achieved the best prediction accuracy for Firefox, and naive Bayes for Eclipse. Including sentiments in the prediction model raised the prediction accuracy by 2 to 5% across the various classifiers.
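
The comparison between model-1 (meta-fields only) and model-2 (meta-fields plus sentiment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the field names (`severity`, `component`, `sentiment`), the toy reports, and the helper functions are all hypothetical, and the sentiment label is assumed to have been computed beforehand by some sentiment analysis tool. Only two of the classifiers named in the abstract are sketched, Zero-R and a simple categorical naive Bayes.

```python
import math
from collections import Counter, defaultdict

# Toy bug reports invented purely for illustration; they are NOT taken from
# the Firefox or Eclipse data sets. All field names are hypothetical.
REPORTS = [
    {"severity": "major", "component": "ui",   "sentiment": "positive", "fixed": True},
    {"severity": "minor", "component": "ui",   "sentiment": "positive", "fixed": True},
    {"severity": "major", "component": "core", "sentiment": "neutral",  "fixed": True},
    {"severity": "major", "component": "ui",   "sentiment": "negative", "fixed": False},
    {"severity": "minor", "component": "core", "sentiment": "negative", "fixed": False},
    {"severity": "minor", "component": "ui",   "sentiment": "negative", "fixed": False},
    {"severity": "major", "component": "core", "sentiment": "negative", "fixed": False},
    {"severity": "minor", "component": "core", "sentiment": "neutral",  "fixed": False},
]

MODEL_1_FIELDS = ["severity", "component"]        # meta-fields only
MODEL_2_FIELDS = MODEL_1_FIELDS + ["sentiment"]   # meta-fields plus sentiment


def zero_r(train):
    """Zero-R baseline: always predict the majority class seen in training."""
    majority = Counter(r["fixed"] for r in train).most_common(1)[0][0]
    return lambda report: majority


def naive_bayes(train, fields):
    """Categorical naive Bayes with add-one (Laplace) smoothing."""
    class_counts = Counter(r["fixed"] for r in train)
    value_counts = defaultdict(Counter)  # (field, label) -> value frequencies
    vocab = defaultdict(set)             # field -> distinct values seen
    for r in train:
        for f in fields:
            value_counts[(f, r["fixed"])][r[f]] += 1
            vocab[f].add(r[f])

    def predict(report):
        scores = {}
        for label, n in class_counts.items():
            score = math.log(n / len(train))  # log prior
            for f in fields:
                numerator = value_counts[(f, label)][report[f]] + 1
                denominator = n + len(vocab[f])
                score += math.log(numerator / denominator)  # log likelihood
            scores[label] = score
        return max(scores, key=scores.get)

    return predict


def accuracy(predict, reports):
    """Fraction of reports whose predicted label matches the 'fixed' field."""
    return sum(predict(r) == r["fixed"] for r in reports) / len(reports)
```

Under these assumptions, `accuracy(naive_bayes(train, MODEL_1_FIELDS), test)` and `accuracy(naive_bayes(train, MODEL_2_FIELDS), test)` can be compared on a held-out split, with `zero_r(train)` serving as the baseline that any useful model should beat.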

Keywords