IEEE Access (Jan 2018)

A Novel Approach for Video Text Detection and Recognition Based on a Corner Response Feature Map and Transferred Deep Convolutional Neural Network

  • Wei Lu,
  • Hongbo Sun,
  • Jinghui Chu,
  • Xiangdong Huang,
  • Jiexiao Yu

DOI
https://doi.org/10.1109/ACCESS.2018.2851942
Journal volume & issue
Vol. 6
pp. 40198 – 40211

Abstract

The text presented in videos contains important information for content analysis, indexing, and retrieval of videos. The key technique for extracting this information is to find, verify, and recognize video text in various languages and fonts against complex backgrounds. In this paper, we propose a novel method that combines a corner response feature map and transferred deep convolutional neural networks for detecting and recognizing video text. First, we use a corner response feature map to detect candidate text regions with high recall. Next, we partition the candidate text regions into candidate text lines by projection analysis using two alternative methods. We then construct classification networks transferred from VGG16, ResNet50, and InceptionV3 to eliminate false positives. Finally, we develop a novel fuzzy c-means clustering-based separation algorithm to obtain a clean text layer from complex backgrounds so that the text can be recognized correctly by commercial optical character recognition software. We evaluated the proposed method on three publicly available test data sets and on a high-resolution test data set that we constructed; the results show that it is robust and performs well on both video text detection and recognition.
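
The abstract does not say which two projection methods are used to partition candidate regions into text lines. As an illustration only, the following is a minimal sketch of one common form of projection analysis, a horizontal projection profile over a binary mask of text-like pixels. The function name `split_text_lines` and the parameters `min_fraction` and `min_height` are hypothetical and are not taken from the paper.

```python
import numpy as np


def split_text_lines(mask, min_fraction=0.05, min_height=2):
    """Split a binary candidate-region mask into text lines.

    Illustrative sketch only (not the paper's algorithm): rows whose count
    of text-like pixels exceeds a fraction of the region width are treated
    as belonging to a text line, and consecutive such rows are grouped.

    mask: 2-D array, nonzero where a pixel is flagged as text-like.
    Returns a list of (top, bottom) row ranges with bottom exclusive.
    """
    mask = np.asarray(mask) != 0
    profile = mask.sum(axis=1)                # text-like pixels per row
    threshold = min_fraction * mask.shape[1]  # assumed, tunable cut-off
    active = profile > threshold

    lines = []
    start = None
    for row, on in enumerate(active):
        if on and start is None:
            start = row                       # a new line begins
        elif not on and start is not None:
            if row - start >= min_height:     # drop runs that are too thin
                lines.append((start, row))
            start = None
    # Close a line that runs to the bottom edge of the region.
    if start is not None and len(active) - start >= min_height:
        lines.append((start, len(active)))
    return lines
```

Each returned row range can then be cropped from the candidate region and passed to a classifier for verification. A vertical projection (summing along `axis=0`) would split a line into words or characters in the same way.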

Keywords