IEEE Access (Jan 2024)

DeepCNN: A Dual Approach to Fault Localization and Repair in Convolutional Neural Networks

  • Mohammad Wardat,
  • Abdullah Al-Alaj

DOI
https://doi.org/10.1109/ACCESS.2024.3384981
Journal volume & issue
Vol. 12
pp. 50321 – 50334

Abstract

Deep learning models, particularly Convolutional Neural Networks (CNNs), play a pivotal role in intelligent software. However, like any software, CNN-based applications are susceptible to bugs. Bug-fix patterns for CNNs differ from those of traditional software, primarily because of the models' inherent black-box nature. Moreover, existing methods, which are tailored to generic DNN structures, are time-consuming, require specialized expertise, and are not directly applicable to the unique structure and requirements of CNNs. To address these issues, we propose DeepCNN, an automated tool that identifies the root causes of CNN faults and fixes prevalent training faults that degrade the performance and efficiency of CNN programs. DeepCNN is data-driven: it employs a transformer encoder model to abstract token-level CNN code, enabling it to detect and fix bugs in real-world CNN models. Beyond identification, DeepCNN repairs CNN hyperparameter misconfigurations using an optimization solver. In addition, by analyzing data dependencies and employing a search algorithm, it determines optimal hyperparameter values for the problematic models and generates patches. In an evaluation on 36 diverse buggy models, DeepCNN outperformed existing methods in fault localization and repair, identifying potential problems with a 90% detection rate. Among these problematic models, it repaired 90%, yielding an average accuracy improvement of 30%.

Keywords