IEEE Access (Jan 2023)

Pre-Trained Model-Based NFR Classification: Overcoming Limited Data Challenges

  • Kiramat Rahman,
  • Anwar Ghani,
  • Abdulrahman Alzahrani,
  • Muhammad Usman Tariq,
  • Arif Ur Rahman

DOI
https://doi.org/10.1109/ACCESS.2023.3301725
Journal volume & issue
Vol. 11
pp. 81787 – 81802

Abstract

Machine learning (ML) techniques have shown promising results in classifying non-functional requirements (NFRs). However, the lack of annotated training data in requirements engineering undermines the accuracy, generalization, and reliability of ML-based methods, leading to overfitting, poor performance, biased models, and out-of-vocabulary issues. This study presents an approach for classifying NFRs in software requirements specification documents using features extracted from pre-trained word embedding models. Novel algorithms are designed to extract relevant, representative features from these models. Each pre-trained model (GloVe, Word2Vec, and fastText) is paired with four tailored neural network architectures for NFR classification: RPCNN, RPBiLSTM, RPLSTM, and RPANN. These pairings yield twelve distinct models, each with its own configuration and characteristics. The results show that pre-trained GloVe combined with RPBiLSTM achieves the highest performance, with an average Area Under the Curve (AUC) score of 96%, a precision of 85%, and a recall of 82%. Among the Word2Vec-based models, RPLSTM achieved an AUC score of 95%, a precision of 86%, and a recall of 80%. Similarly, among the fastText-based models, RPBiLSTM yielded competitive performance, with an AUC score of 95%, a precision of 85%, and a recall of 80%. This integrated approach provides a practical solution for analyzing and classifying NFRs, thereby facilitating improved software development practices.
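
To make the pairing of embeddings and architectures concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy in-memory vector table (`TOY_VECTORS`) standing in for real pre-trained vectors, and it assumes that out-of-vocabulary words are mapped to a zero vector, a common fallback that the abstract does not confirm. The helper names `build_embedding_matrix` and `encode` are hypothetical.

```python
from itertools import product

# Hypothetical toy "pre-trained" vectors; real GloVe, Word2Vec, or fastText
# vectors would be loaded from their published files instead.
TOY_VECTORS = {
    "system": [0.2, 0.1, 0.4],
    "shall": [0.0, 0.3, 0.1],
    "respond": [0.5, 0.2, 0.0],
}


def build_embedding_matrix(vocab, vectors, dim):
    """Return one row per vocabulary word.

    Assumption: words missing from the pre-trained table
    (out-of-vocabulary) receive a zero row.
    """
    return [list(vectors.get(word, [0.0] * dim)) for word in vocab]


def encode(sentence, vocab):
    """Map a requirement sentence to embedding-matrix row indices,
    skipping words that are not in the vocabulary."""
    index = {word: i for i, word in enumerate(vocab)}
    return [index[w] for w in sentence.lower().split() if w in index]


# Three embedding sources paired with four architectures give twelve models.
EMBEDDINGS = ["GloVe", "Word2Vec", "fastText"]
ARCHITECTURES = ["RPCNN", "RPBiLSTM", "RPLSTM", "RPANN"]
MODEL_GRID = list(product(EMBEDDINGS, ARCHITECTURES))

vocab = ["system", "shall", "respond", "quickly"]
matrix = build_embedding_matrix(vocab, TOY_VECTORS, dim=3)
token_ids = encode("The system shall respond quickly", vocab)
```

In this sketch, `matrix` would serve as the frozen input layer of any of the four classifiers, and `MODEL_GRID` lists the twelve embedding and architecture pairs that the abstract evaluates.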

Keywords