IEEE Access (Jan 2024)
Incremental Recognition of Multi-Style Tibetan Character Based on Transfer Learning
Abstract
Tibetan script has a distinctive calligraphic form, intricate glyph structures, and diverse stylistic variations. Recognizing Tibetan text across fonts whose styles differ significantly remains a challenge. Existing research has made considerable progress on single-style Tibetan script recognition using convolutional neural networks and convolutional recurrent neural networks. For multi-style recognition, however, the standard approach is multi-label joint training: samples from different font styles are annotated with both style and class and merged into a single dataset for model training. As the amount of data and the performance requirements grow, this approach suffers from decreasing accuracy, insufficient generalization capability, and poor adaptability to new style samples. In this paper, we propose a transfer learning-based method for incremental recognition of multi-style Tibetan script, which we call “multi-style Tibetan script incremental recognition.” In the style recognition stage, we employ a convolutional neural network (CNN) to accurately differentiate between style categories. In the pre-training stage, we train a residual network on the standard Tibetan Uchen style and use it as the baseline model. In the multi-style Tibetan script recognition stage, we integrate transfer learning into model training to reduce training time. Together, these three stages accomplish multi-style Tibetan script incremental recognition. Experimental results on the TCDB and HUTD datasets show that our approach improves overall recognition accuracy from 90.14% to 98.40% compared with traditional multi-task recognition methods.
This method exhibits high accuracy, strong generalization capability, and good adaptability to new style samples in multi-style Tibetan script character recognition. Furthermore, it can be applied to other tasks involving multi-style, multi-font, and multi-script recognition.
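The three stages described above can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the names `Recognizer`, `transfer`, `MultiStylePipeline`, and the `adapt` callback are hypothetical, weights are plain lists standing in for network parameters, and the style classifier is a trivial lookup, not a CNN. It shows only the control flow: a style classifier routes each sample to a style-specific recognizer, and a new style is added by copying the pre-trained baseline and adapting the copy, leaving existing recognizers untouched.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Recognizer:
    """Toy stand-in for a character recognizer (weights as plain lists)."""
    backbone: List[float]
    head: List[float]


def transfer(base: Recognizer, adapt: Callable[[Recognizer], None]) -> Recognizer:
    """Start from a copy of the baseline weights, then adapt the copy in place."""
    model = copy.deepcopy(base)
    adapt(model)  # hypothetical fine-tuning step supplied by the caller
    return model


@dataclass
class MultiStylePipeline:
    classify_style: Callable[[dict], str]  # stage 1: style recognition
    base: Recognizer                       # stage 2: pre-trained baseline
    by_style: Dict[str, Recognizer] = field(default_factory=dict)

    def add_style(self, style: str, adapt: Callable[[Recognizer], None]) -> None:
        """Stage 3: register a new style without touching existing models."""
        self.by_style[style] = transfer(self.base, adapt)

    def recognizer_for(self, sample: dict) -> Recognizer:
        """Route a sample to the recognizer for its predicted style."""
        return self.by_style[self.classify_style(sample)]


# Usage: the baseline serves the Uchen style; a new style is added incrementally.
base = Recognizer(backbone=[0.1, 0.2], head=[0.0])
pipeline = MultiStylePipeline(classify_style=lambda s: s["style"], base=base)
pipeline.by_style["uchen"] = base


def adapt_new_style(model: Recognizer) -> None:
    model.head = [1.0]  # placeholder for training on the new style's samples


pipeline.add_style("new_style", adapt_new_style)
```

In this sketch, adding a style never retrains the recognizers already registered, which mirrors the incremental property claimed for the method. Whether the actual method freezes or updates the transferred layers is not stated here.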
Keywords