Software Defect Prediction Using an Intelligent Ensemble-Based Model

Misbah Ali; Tehseen Mazhar; Yasir Arif; Shaha Al-Otaibi; Yazeed Yasin Ghadi; Tariq Shahzad; Muhammad Amir Khan; Habib Hamam

doi:10.1109/ACCESS.2024.3358201

IEEE Access (Jan 2024)

Affiliations

Misbah Ali: Department of Computer Science, Virtual University of Pakistan, Lahore, Pakistan
Tehseen Mazhar: Department of Computer Science, Virtual University of Pakistan, Lahore, Pakistan
Yasir Arif: Department of Computer Science, Global Institute, Lahore, Pakistan
Shaha Al-Otaibi: Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, Saudi Arabia
Yazeed Yasin Ghadi: Department of Computer Science and Software Engineering, Al Ain University, Abu Dhabi, United Arab Emirates
Tariq Shahzad: Department of Electrical and Electronic Engineering Science, School of Electrical Engineering, University of Johannesburg, Johannesburg, South Africa
Muhammad Amir Khan: School of Computing Sciences, College of Computing, Informatics and Mathematics, Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia
Habib Hamam: Department of Electrical and Electronic Engineering Science, School of Electrical Engineering, University of Johannesburg, Johannesburg, South Africa

Journal volume & issue: Vol. 12, pp. 20376–20395

Abstract

Software defect prediction plays a crucial role in improving software quality while reducing testing costs. Its primary objective is to identify the modules likely to be defective so that only those are sent to the testing stage. This research introduces an intelligent ensemble-based software defect prediction model that combines diverse classifiers. The proposed model detects defective modules through a two-stage prediction process. In the first stage, four supervised machine learning algorithms are employed: Random Forest, Support Vector Machine, Naïve Bayes, and Artificial Neural Network. Each algorithm is tuned through iterative parameter optimization to achieve the highest possible accuracy. In the second stage, the predictions of the individual classifiers are integrated in a voting ensemble to make the final predictions. This ensemble approach further improves the accuracy and reliability of the defect predictions. Seven historical defect datasets from the NASA MDP repository, namely CM1, JM1, MC2, MW1, PC1, PC3, and PC4, were used to implement and evaluate the proposed defect prediction system. The results demonstrate that, on each dataset, the proposed intelligent system achieved remarkable accuracy, outperforming twenty state-of-the-art defect prediction techniques, including base classifiers and ensemble methods.
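
The two-stage process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, replaces the NASA MDP datasets with a small synthetic imbalanced dataset, uses an exhaustive grid search as a stand-in for the paper's iterative parameter optimization, and assumes hard (majority) voting, since the abstract does not state the voting rule. The parameter grids and variable names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for a defect dataset (NOT NASA MDP data):
# roughly 20% of the samples play the role of defective modules.
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Stage 1: the four base learners, each with a small hypothetical grid.
# SVM and the neural network are scaled first because both are
# sensitive to feature magnitudes.
candidates = {
    "rf": (RandomForestClassifier(random_state=0),
           {"n_estimators": [50, 100], "max_depth": [None, 5]}),
    "svm": (make_pipeline(StandardScaler(), SVC()),
            {"svc__C": [0.1, 1.0, 10.0]}),
    "nb": (GaussianNB(),
           {"var_smoothing": [1e-9, 1e-7]}),
    "ann": (make_pipeline(StandardScaler(),
                          MLPClassifier(max_iter=500, random_state=0)),
            {"mlpclassifier__hidden_layer_sizes": [(16,), (32,)]}),
}

tuned = []
for name, (estimator, grid) in candidates.items():
    # Grid search selects the setting with the best cross-validated accuracy.
    search = GridSearchCV(estimator, grid, scoring="accuracy", cv=3)
    search.fit(X_train, y_train)
    tuned.append((name, search.best_estimator_))

# Stage 2: combine the tuned classifiers by majority vote.
# VotingClassifier refits clones of the tuned estimators on the training data.
ensemble = VotingClassifier(estimators=tuned, voting="hard")
ensemble.fit(X_train, y_train)

accuracy = ensemble.score(X_test, y_test)
print(f"ensemble accuracy on held-out data: {accuracy:.3f}")
```

With four voters a 2-2 tie is possible under hard voting; scikit-learn resolves such ties by picking the class that comes first in ascending label order, which here is the non-defective class. Soft voting (averaging predicted probabilities) would avoid this, at the cost of requiring probability estimates from every base learner.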

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
