Mathematical Biosciences and Engineering (May 2020)

Robust table recognition for printed document images

  • Qiaokang Liang,
  • Jianzhong Peng,
  • Zhengwei Li,
  • Daqi Xie,
  • Wei Sun,
  • Yaonan Wang,
  • Dan Zhang

DOI
https://doi.org/10.3934/mbe.2020182
Journal volume & issue
Vol. 17, no. 4
pp. 3203 – 3223

Abstract

The recognition and analysis of tables in printed document images is a popular research topic in pattern recognition and image processing. Existing table recognition methods usually require a high degree of regularity in the input tables, and their robustness still needs significant improvement. This paper focuses on a robust table recognition system that consists mainly of three parts: image preprocessing, cell location based on contour mutual exclusion, and recognition of printed Chinese characters based on a deep learning network. A table recognition app has been developed based on the proposed algorithms; it can transform captured images into editable text in real time. The effectiveness of the app has been verified by testing on a dataset of 105 images. The test results show that it identifies high-quality tables well, and that its recognition rate on low-quality tables with distortion and blur reaches 81%, which is considerably higher than that of existing methods. The work in this paper could give insights into the application of table recognition and analysis algorithms.

Keywords