Journal of King Saud University: Computer and Information Sciences (Jun 2024)

Multi-model feature aggregation for classification of laser welding images with vision transformer

  • Nasir Ud Din,
  • Li Zhang,
  • M. Saqib Nawaz,
  • Yatao Yang

Journal volume & issue
Vol. 36, no. 5
p. 102049

Abstract

Laser welding is common in many manufacturing lines, including those for lithium-ion power batteries, because of its productivity, effectiveness, and flexibility. However, the welding quality is not always consistent. This study investigates laser welding in the manufacture of lithium-ion battery components, such as the anode, cathode, and safety vent, where consistent weld quality is crucial for battery performance and safety. Evaluating laser-welded products is therefore indispensable for industrial production and the public domain. Several destructive and non-destructive techniques have been used for laser welding evaluation, but they are ineffective and too cumbersome to be adopted in mass manufacturing. More recently, machine vision has been adopted to distinguish between successful and unsuccessful laser welds, which opened new perspectives for evaluating weld quality with digital image techniques. However, these methods neither perform well across multiple laser-welding products nor achieve optimal classification accuracy. Deep learning has evolved remarkably in recent years and has gained popularity for detecting welding defects. This paper presents a Hybrid Vision Transformer (HViT) that classifies laser welding images according to their feature patch images, based on multi-model feature aggregation. VGG-16 and MobileNet, both pre-trained on ImageNet, were used as the core models to extract rich features from the laser welding image. A squeeze-and-excitation (SE) module was used to integrate these features, and a multi-layer perceptron (MLP) trained with label smoothing was used for classification.
To determine the effectiveness of the proposed strategy, a comparative analysis was conducted against a range of alternative machine learning and deep learning approaches. The results indicate that the proposed strategy surpasses all other methods in classification accuracy and the other evaluation metrics on the test dataset.
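
The feature integration and label smoothing steps named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the feature values, the weight matrices `w1` and `w2`, the reduction to two hidden units, and the smoothing factor `epsilon=0.1` are all hypothetical, and the backbone features are assumed to be already pooled to one value per channel. The sketch only shows the general shape of squeeze-and-excitation style gating on concatenated features and of a cross-entropy loss with smoothed targets.

```python
import math

def se_gate(features, w1, w2):
    """Squeeze-and-excitation style gating on an already-pooled feature vector.

    features: list of C floats (one pooled value per channel)
    w1: R x C reduction weights, w2: C x R expansion weights (hypothetical)
    """
    # Excitation, step 1: reduce to R hidden units with a ReLU.
    hidden = [max(0.0, sum(w * f for w, f in zip(row, features))) for row in w1]
    # Excitation, step 2: expand back to C gates in (0, 1) with a sigmoid.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: reweight each channel by its gate.
    return [g * f for g, f in zip(gates, features)]

def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Replace a one-hot target with a smoothed distribution."""
    off = epsilon / num_classes
    return [1.0 - epsilon + off if i == target_index else off
            for i in range(num_classes)]

def cross_entropy(logits, target_dist):
    """Cross-entropy between softmax(logits) and a target distribution."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for t, x in zip(target_dist, logits))

# Hypothetical pooled features from two backbones (two channels each).
vgg_features = [0.5, 1.0]
mobilenet_features = [2.0, 0.0]
fused = vgg_features + mobilenet_features  # concatenation, C = 4

# Hypothetical SE weights: reduce 4 channels to 2, then expand back to 4.
w1 = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.5, 0.5, 0.0]]
w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0], [0.5, 0.5]]
reweighted = se_gate(fused, w1, w2)

# Two classes (for example, unsuccessful and successful weld); true class is 1.
targets = smooth_labels(target_index=1, num_classes=2, epsilon=0.1)
loss = cross_entropy([0.2, 1.3], targets)
```

In this sketch the gates only rescale the concatenated channels, so the fused vector keeps its length, and the smoothed targets still sum to one while moving a small amount of probability mass away from the true class.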

Keywords