IEEE Access (Jan 2024)

A Terminal Tube Text Detection and Recognition Method Based on Improved YOLOv7-Tiny and CRNN

  • Liao Huilian
  • Du Xingwei
  • He Luhang
  • Wang Shanlei
  • Yao Meng
  • Zou Hongbo

DOI
https://doi.org/10.1109/ACCESS.2024.3426654
Journal volume & issue
Vol. 12
pp. 96358 – 96369

Abstract

Automatic identification of terminal coding tubes is a crucial step in the intelligent inspection of substation terminal lines. This study proposes a deep-learning method for detecting and recognizing the text on terminal coding tubes. In the text detection stage, the YOLOv7-tiny algorithm is improved to handle coding tubes that are bent, densely arranged, and of varying length. The GhostV2 module replaces the Efficient Layer Aggregation Network (ELAN) module in the backbone network, which improves feature extraction while reducing the number of model parameters. A coordinate attention (CA) mechanism is embedded before the prediction stage to strengthen the network's sensitivity to key feature regions. In the text recognition stage, the convolutional recurrent neural network (CRNN) recognition model is improved in two ways. First, the original VGG16 convolutional layers are replaced with the ResNet18 architecture to reduce the loss of feature information. Second, the Funnel Rectified Linear Unit (FReLU) activation function is integrated to enhance the network's feature expression and extraction. In experiments on the detection data set, the improved YOLOv7-tiny model reaches a precision of 96.70%, a recall of 89.88%, and an average precision of 95.38%. On the recognition data set, the improved CRNN reaches a character recognition accuracy of 90.2%. These results support the efficiency and accuracy of the method for text detection and recognition on terminal coding tubes.
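
As an aid to reading the detection figures, the following minimal sketch shows how precision and recall are commonly computed for a box detector. It is not the authors' evaluation code. The greedy one-to-one matching, the IoU threshold of 0.5, and the names `iou` and `precision_recall` are assumptions made for illustration only.

```python
from typing import List, Tuple

# Axis-aligned box as (x1, y1, x2, y2); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], iou_thr: float = 0.5
) -> Tuple[float, float]:
    """Greedy matching: each prediction claims the best unmatched truth.

    `preds` is assumed to be sorted by descending confidence.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best_j, best_iou = -1, iou_thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching truth
    fn = len(truths) - tp  # truths that no prediction matched
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


# Hypothetical toy data: three labelled text regions, three predictions.
truths = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
preds = [(0, 0, 10, 10), (21, 21, 30, 30), (100, 100, 110, 110)]
p, r = precision_recall(preds, truths)
```

In this toy case two of the three predictions match a labelled region and one labelled region is missed, so both precision and recall equal 2/3. A precision higher than recall, as reported in the abstract, means that missed text regions are more common than false detections.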

Keywords