IEEE Access (Jan 2021)

Deep Learning-Based Correct Answer Prediction for Developer Forums

  • Hafiz Umar Iftikhar,
  • Aqeel Ur Rehman,
  • Olga A. Kalugina,
  • Qasim Umer,
  • Haris Ali Khan

DOI
https://doi.org/10.1109/ACCESS.2021.3108416
Journal volume & issue
Vol. 9
pp. 128166 – 128177

Abstract

Developer forums are essential for software engineers, who rely on the experts active there to help solve their problems. However, the answers posted for a problem are sometimes unsatisfactory, or it is difficult to identify which answer is the correct one. Information seekers usually browse all the answers within a question thread to find it. This manual selection of correct answers is tedious and time-consuming. In this paper, we propose an automatic classification approach to predict the correct answers for developer forums. We first extract the metadata and the question/answer (Q/A) combination for each thread of a developer community (Stack Overflow). Then, natural language processing techniques are applied to preprocess the Q/A combinations of the given dataset. After that, a keyword ranking algorithm is leveraged to extract keywords and their ranking scores for each Q/A combination, from which a keywords-based feature vector is constructed. Subsequently, word embedding is leveraged to convert each preprocessed Q/A combination into a text-based feature vector. Finally, we pass the metadata, keywords-based features, and text-based features to an ensemble deep learning model for training to predict correct answers. The results of 10-fold cross-validation indicate that the proposed approach is accurate and surpasses the state of the art. On average, it improves accuracy, precision, recall, and F-measure by up to 1.72%, 24.96%, 6.57%, and 16.62%, respectively.
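
The feature-construction steps described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions: the abstract does not name the keyword ranking algorithm, so TF-IDF is swapped in as a stand-in; the stopword list, the function names (preprocess, rank_keywords, build_feature_vector), and the two metadata values are hypothetical; and the word-embedding features and the ensemble deep learning model are left out entirely. It is not the authors' implementation.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (hypothetical; a real pipeline would use a fuller one).
STOPWORDS = {"a", "an", "the", "is", "to", "of", "in", "and", "or",
             "how", "i", "do", "it", "for", "use"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9_]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def rank_keywords(tokens, doc_freq, n_docs, top_k=3):
    """Return up to top_k (term, score) pairs, highest score first.

    TF-IDF is an assumption standing in for the unnamed ranking algorithm.
    """
    if not tokens:
        return []
    tf = Counter(tokens)
    scores = {}
    for term, count in tf.items():
        # Smoothed inverse document frequency; unseen terms get doc_freq 0.
        idf = math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        scores[term] = (count / len(tokens)) * idf
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


def build_feature_vector(question, answer, metadata, doc_freq, n_docs, top_k=3):
    """Concatenate metadata with the top_k keyword scores of one Q/A pair."""
    tokens = preprocess(question + " " + answer)
    keyword_scores = [s for _, s in rank_keywords(tokens, doc_freq, n_docs, top_k)]
    # Pad with zeros so every Q/A pair yields a vector of the same length.
    keyword_scores += [0.0] * (top_k - len(keyword_scores))
    return list(metadata) + keyword_scores


# Toy corpus of Q/A combinations (invented for illustration only).
corpus = [
    ("How do I reverse a list in Python?", "Call reverse or slice with a step of -1."),
    ("How to sort a list?", "Call sorted on the list."),
]
token_sets = [set(preprocess(q + " " + a)) for q, a in corpus]
doc_freq = Counter(term for terms in token_sets for term in terms)

# Hypothetical metadata: answer score and answerer reputation, already scaled.
features = build_feature_vector(*corpus[0], metadata=[0.8, 0.3],
                                doc_freq=doc_freq, n_docs=len(corpus))
print(features)
```

In a fuller pipeline, a vector like this would be joined with the embedding-based text features before being passed to the classifier.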

Keywords