IEEE Access (Jan 2024)

LCNN: Lightweight CNN Architecture for Software Defect Feature Identification Using Explainable AI

  • Momotaz Begum,
  • Mehedi Hasan Shuvo,
  • Mostofa Kamal Nasir,
  • Amran Hossain,
  • Mohammad Jakir Hossain,
  • Imran Ashraf,
  • Jia Uddin,
  • Md. Abdus Samad

DOI
https://doi.org/10.1109/ACCESS.2024.3388489
Journal volume & issue
Vol. 12
pp. 55744 – 55756

Abstract

Software defect identification (SDI) is a key part of improving the quality of software projects and lowering the risks that come with maintenance. However, efforts to identify the causes of software defects have not yet produced satisfactory results. Meanwhile, researchers have recently developed many models, including NN, ML, DL, advanced CNN, and LSTM, to improve the effectiveness of defect prediction. Because of insufficient dataset size, repeated investigations, and outdated baseline selection, research on CNN models has been unable to produce reliable results. In addition, explainable AI (XAI), a well-known approach to explaining deep models in computer vision, can also handle software defect prediction in a way that is easy for humans to understand. To address these issues, we first used SMOTE to preprocess the data, both categorical and numerical, collected from the NASA repository. Second, we experimented with software defect prediction using 1D-CNN and 2D-CNN models, named lightweight CNN (LCNN). For evaluation, we employed 100-repetition holdout validation. For the cross-validation setup, the 1D-CNN model used a $20\times 1$ configuration and the 2D-CNN model used a $4\times 5 \times 1$ configuration. The experimental results were then compared and assessed in terms of accuracy, MSE, and AUC. The results show that 2D-CNN performs 1.36% better than 1D-CNN. Third, we investigated the identification of software defect features using LIME and SHAP, which are state-of-the-art XAI techniques. However, we could not use 2D-CNN for this purpose because it involves more complex relationships, which makes it challenging to create transparent explanations. We therefore concluded that 1D-CNN gives superior results for explaining the root causes in software defect feature identification.
Finally, LIME provides more accurate visualization of software defect features than SHAP, and it helps stakeholders in the software industry easily find the actual root causes of software defects.
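
The two configurations and the repeated holdout protocol mentioned in the abstract can be illustrated with a minimal sketch. It assumes that $20\times 1$ and $4\times 5 \times 1$ describe how a 20-feature record is laid out as model input, that the 2D layout is filled row by row, and that each holdout repetition uses a random 80/20 split; none of these details are stated in the abstract. The function names `to_1d_input`, `to_2d_input`, and `repeated_holdout_splits` are hypothetical and are not taken from the paper.

```python
import random


def to_1d_input(features):
    """Arrange 20 feature values as a 20x1 column (assumed 1D-CNN layout)."""
    if len(features) != 20:
        raise ValueError("expected exactly 20 features")
    return [[value] for value in features]


def to_2d_input(features, rows=4, cols=5):
    """Arrange the same values as a rows x cols x 1 grid (assumed row-major)."""
    if len(features) != rows * cols:
        raise ValueError("feature count must equal rows * cols")
    return [[[features[r * cols + c]] for c in range(cols)] for r in range(rows)]


def repeated_holdout_splits(n_samples, repetitions=100, test_fraction=0.2, seed=0):
    """Yield (train_indices, test_indices) for each random holdout repetition.

    The 0.2 test fraction is an assumption, not a value from the abstract.
    """
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    for _ in range(repetitions):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_test:], indices[:n_test]


# Toy record with 20 placeholder feature values (not real NASA data).
sample = [float(i) for i in range(20)]
column = to_1d_input(sample)
grid = to_2d_input(sample)
splits = list(repeated_holdout_splits(n_samples=50))
```

In such a setup, a model would be trained and scored once per split, and accuracy, MSE, and AUC would be averaged over the 100 repetitions.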

Keywords