RJ-TinyViT: an efficient vision transformer for red jujube defect classification

Chengyu Hu; Jianxin Guo; Hanfei Xie; Qing Zhu; Baoxi Yuan; Yujie Gao; Xiangyang Ma; Jialu Chen

doi:10.1038/s41598-024-77333-6

Scientific Reports (Nov 2024)

Affiliations

Chengyu Hu: School of Electronic Information, Xijing University
Jianxin Guo: School of Electronic Information, Xijing University
Hanfei Xie: School of Electronic Information, Xijing University
Qing Zhu: Beijing Hengyue Intelligent Information Technology Co., Ltd
Baoxi Yuan: School of Electronic Information, Xijing University
Yujie Gao: School of Electronic Information, Xijing University
Xiangyang Ma: School of Electronic Information, Xijing University
Jialu Chen: School of Electronic Information, Xijing University

DOI: https://doi.org/10.1038/s41598-024-77333-6
Journal volume & issue: Vol. 14, no. 1
pp. 1 – 28

Abstract

Compared with surface defect detection for industrial products manufactured to specified processes, detecting surface defects in naturally grown red jujubes poses distinct challenges. The defects are highly diverse, differ only subtly from the background, show low contrast, vary in scale, and appear in images with high levels of noise, all of which make the task considerably more complex. Existing methods fall short on these issues, mainly because of insufficient feature extraction capability and overly complex network structures, which limit model efficiency and practical performance. To address these challenges, this study proposes an optimized Tiny Vision Transformer (TinyViT) network structure, named RJ-TinyViT. The method refines the TinyViT-5M network structure to reduce network burden and introduces an improved Multi-Kernel Block (MK Block) and an improved Mobile Inverted Bottleneck Convolution Block (MBConv Block) to enhance feature extraction. It also integrates the Coordinate Attention (CA) module to strengthen the model's ability to recognize and focus on surface defect features of red jujubes. Experimental results show that RJ-TinyViT achieved a classification accuracy of 93.38%, an improvement of 1.84% over the original TinyViT network, while its floating-point operations (FLOPs) and parameters (Params) were reduced to 58.97% and 39.84% of the original TinyViT network, respectively. These results demonstrate that RJ-TinyViT achieves a lightweight model while maintaining high accuracy, and they highlight its value for practical industrial applications.
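The efficiency figures in the abstract are stated as the share of the original TinyViT cost that remains, not as the amount removed. The short sketch below converts them into the equivalent reductions and compression factors. The function and variable names are illustrative and do not come from the paper, and the baseline accuracy line assumes the 1.84% gain is measured in percentage points, which the abstract does not state explicitly.

```python
def reduction_from_remaining(remaining_pct: float) -> float:
    """Percent of the baseline cost removed, given the percent that remains."""
    return round(100.0 - remaining_pct, 2)


def compression_factor(remaining_pct: float) -> float:
    """How many times smaller the cost is relative to the baseline."""
    return round(100.0 / remaining_pct, 2)


# Figures quoted in the abstract (share of the original TinyViT cost).
FLOPS_REMAINING_PCT = 58.97
PARAMS_REMAINING_PCT = 39.84
RJ_ACCURACY_PCT = 93.38
ACCURACY_GAIN = 1.84

flops_cut = reduction_from_remaining(FLOPS_REMAINING_PCT)
params_cut = reduction_from_remaining(PARAMS_REMAINING_PCT)
flops_factor = compression_factor(FLOPS_REMAINING_PCT)
params_factor = compression_factor(PARAMS_REMAINING_PCT)

# ASSUMPTION: the gain is in percentage points; under a relative
# interpretation the implied baseline would differ.
baseline_accuracy_if_points = round(RJ_ACCURACY_PCT - ACCURACY_GAIN, 2)

print(f"FLOPs removed:  {flops_cut}% ({flops_factor}x fewer)")
print(f"Params removed: {params_cut}% ({params_factor}x fewer)")
print(f"Implied baseline accuracy (assumed points): {baseline_accuracy_if_points}%")
```

Read this way, the abstract's figures correspond to roughly 41% fewer FLOPs and roughly 60% fewer parameters than the original network.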

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
