IEEE Access (Jan 2024)

EnsCL-CatBoost: A Strategic Framework for Software Requirements Classification

  • Jalil Abbas,
  • Cheng Zhang,
  • Bin Luo

DOI
https://doi.org/10.1109/ACCESS.2024.3452011
Journal volume & issue
Vol. 12
pp. 127614 – 127628

Abstract

Accurate classification of software requirements into functional and non-functional categories is crucial for developing reliable and efficient software systems. However, existing methods often suffer from insufficient semantic understanding and struggle to handle diverse software requirements. In this study, we introduce a framework named EnsCL-CatBoost (Ensembled Contrastive Learning with CatBoost) that addresses these challenges by improving classification accuracy and robustness. Our method uses a weighted ensemble of Doc2Vec, Word2Vec, and FastText embeddings, leveraging their strengths for a richer semantic representation. Unlike conventional strategies, we incorporate contrastive learning with the InfoNCE loss function, which increases discriminative power by pulling similar samples together and pushing dissimilar ones apart, thereby improving robustness and model generalization. To tackle class imbalance, we integrate SMOTE-Tomek links into the embedding process, achieving a balanced class distribution before classification. Additionally, our evaluation extends beyond test datasets to new, unlabeled datasets, demonstrating practical applicability in real-world scenarios. We compare our framework’s performance with that of five machine learning classifiers trained using traditional embedding techniques. The results show that our method significantly outperforms the others, achieving 94% accuracy. Our research transparently presents tool-based results, highlighting the potential of automation in software requirement classification, setting a benchmark for practical deployment in diverse environments, and paving the way for future research in the field.
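
To make two of the ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the weighted ensemble is a normalized weighted sum of same-dimension embedding vectors and that InfoNCE is computed per anchor over cosine similarities; the abstract does not specify either detail. The function names `weighted_ensemble`, `cosine`, and `info_nce`, the temperature value, and the toy vectors are all hypothetical.

```python
import math


def weighted_ensemble(vectors, weights):
    """Combine same-dimension embeddings by a weighted sum.

    Assumption: weights are normalized to sum to 1. The paper may
    combine Doc2Vec, Word2Vec, and FastText vectors differently.
    """
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum((w / total) * v[i] for v, w in zip(vectors, weights))
        for i in range(dim)
    ]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor.

    Computes -log( exp(s_pos / t) / sum_k exp(s_k / t) ), where the sum
    runs over the positive and all negatives. The loss is small when the
    anchor is closer to the positive than to every negative.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Toy usage with hypothetical 2-dimensional embeddings.
combined = weighted_ensemble([[1.0, 0.0], [0.0, 1.0]], [1, 1])
low_loss = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
high_loss = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this toy setup the loss is near zero when the positive matches the anchor and large when a negative matches it instead, which is the "cluster similar, separate dissimilar" behavior the abstract describes.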

Keywords