Big Data Analytics (Jan 2018)

Chinese text-line detection from web videos with fully convolutional networks

  • Chun Yang,
  • Wei-Yi Pei,
  • Long-Huang Wu,
  • Xu-Cheng Yin

DOI
https://doi.org/10.1186/s41044-017-0028-2
Journal volume & issue
Vol. 3, no. 1
pp. 1 – 11

Abstract

Background: In recent years, video has become the dominant source of information on the Web, and the text within video usually carries significant semantic information. Video text extraction and recognition play an essential role in web multimedia understanding and retrieval for big visual data analytics and applications. To deal with challenging backgrounds and embedded noise, most conventional approaches design sophisticated pre-processing and post-processing steps before and after text detection. In this paper, we present a simple yet powerful pipeline that directly and uniformly detects Chinese text lines for embedded captions from web videos.

Results: In this Chinese text-line detection system, a fully convolutional network with local context is adopted to localize caption text through end-to-end learning. The resulting caption predictions are at the word level and can be fed directly into a character classifier. Text-line construction is then performed by heuristic strategies. A variety of experiments on several real-world web video datasets demonstrate the effectiveness and efficiency of the proposed method.

Conclusion: The proposed system can directly detect English words and Chinese characters in caption text lines without word or character segmentation, and it achieves high performance on real-world web video datasets.

Keywords