Frontiers in Artificial Intelligence (Dec 2024)
Efficient dataset extension using generative networks for assessing degree of coating degradation around scribe
Abstract
This study presents a novel methodology for dataset augmentation in the semantic segmentation of coil-coated surface degradation. Deep convolutional generative adversarial networks (DCGANs) are used to generate synthetic input-target pairs that closely resemble real-world data, with the goal of expanding an existing dataset. The augmented datasets are used to train two state-of-the-art models, U-Net and DeepLabV3, to detect degradation areas around scribes. A series of experiments shows that adding synthetic data improves the models' performance in detecting degradation, especially when the ratio of synthetic to real data is carefully managed: the largest improvements in accuracy and F1-score are achieved when the ratio of synthetic to original data is between 0.2 and 0.5. The advantages and limitations of different GAN architectures for dataset expansion are also explored, with specific attention to their ability to produce realistic and diverse samples. This work offers a scalable solution to the challenge of creating large, diverse annotated datasets for industrial coil coating degradation assessment. The proposed approach improves model generalization and segmentation accuracy while reducing the burden of manual data annotation, which enables more efficient and accurate degradation detection for industries that rely on coil coatings.
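To make the synthetic-to-original ratio concrete, the following minimal sketch shows one way a training set could be assembled for a chosen ratio. It is an illustration under stated assumptions, not the procedure used in the study: the function names `synthetic_count` and `mix_datasets`, the random subsampling of synthetic pairs, and the placeholder tuples standing in for image-mask pairs are all hypothetical.

```python
import random


def synthetic_count(n_real: int, ratio: float) -> int:
    """Number of synthetic samples needed for a given synthetic-to-real ratio."""
    if n_real < 0 or ratio < 0:
        raise ValueError("n_real and ratio must be non-negative")
    return round(n_real * ratio)


def mix_datasets(real, synthetic, ratio, seed=0):
    """Return all real samples plus a random subset of synthetic ones, shuffled.

    Assumption: synthetic pairs are drawn uniformly at random without
    replacement; the study may select them differently.
    """
    k = synthetic_count(len(real), ratio)
    if k > len(synthetic):
        raise ValueError("not enough synthetic samples for the requested ratio")
    rng = random.Random(seed)
    mixed = list(real) + rng.sample(list(synthetic), k)
    rng.shuffle(mixed)
    return mixed


# Placeholder samples; in practice each item would be an (image, mask) pair.
real = [("real", i) for i in range(100)]
synthetic = [("synthetic", i) for i in range(80)]

for ratio in (0.2, 0.5):
    mixed = mix_datasets(real, synthetic, ratio)
    n_syn = sum(1 for tag, _ in mixed if tag == "synthetic")
    print(ratio, len(mixed), n_syn)
```

With 100 real samples, ratios of 0.2 and 0.5 correspond to adding 20 and 50 synthetic pairs, respectively.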
Keywords