EURASIP Journal on Image and Video Processing (Dec 2017)

Textline detection in degraded historical document images

  • Byeongyong Ahn,
  • Jewoong Ryu,
  • Hyung Il Koo,
  • Nam Ik Cho

DOI
https://doi.org/10.1186/s13640-017-0229-7
Journal volume & issue
Vol. 2017, no. 1
pp. 1 – 13

Abstract

This paper presents a textline detection method for degraded historical documents. Our method follows the conventional two-step procedure in which binarization is performed first and the textlines are then extracted from the binary image. To address the challenges of historical documents, such as document degradation, structure noise, and skew, we develop new methods for both binarization and textline extraction. First, we improve binarization performance by detecting the non-text regions and processing only the text regions. Second, we improve textline detection by extracting the main textblock and compensating for the skew angle and writing style. Experimental results show that the proposed method achieves state-of-the-art performance on several datasets.
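
The two-step procedure described above (binarize, then extract textlines from the binary image) can be illustrated with a minimal sketch. This is not the authors' method: it assumes a global Otsu threshold in place of their binarization and a horizontal projection profile in place of their textline extraction, and it omits non-text region detection, main textblock extraction, and skew compensation. The function names `otsu_threshold`, `binarize`, and `detect_textlines` and the parameter `min_ink` are hypothetical.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the Otsu threshold for a uint8 grayscale image (assumed stand-in)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of the dark class
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean of the dark class
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom   # between-class variance
    sigma_b[denom == 0] = 0.0                 # thresholds that leave one class empty
    return int(np.argmax(sigma_b))


def binarize(gray):
    """Step 1: map dark ink to 1 and bright background to 0."""
    return (gray <= otsu_threshold(gray)).astype(np.uint8)


def detect_textlines(binary, min_ink=1):
    """Step 2: return (top, bottom) row ranges, bottom exclusive, that contain ink.

    Assumes horizontal, unskewed text; rows with fewer than `min_ink` ink
    pixels are treated as gaps between textlines.
    """
    has_ink = binary.sum(axis=1) >= min_ink
    padded = np.concatenate(([False], has_ink, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])   # run starts and ends
    return [(int(top), int(bottom)) for top, bottom in zip(changes[0::2], changes[1::2])]
```

Calling `detect_textlines(binarize(page))` on a grayscale page array returns one row range per detected textline. A plain projection profile such as this breaks down on skewed or noisy pages, which is the kind of difficulty the abstract says the proposed method addresses.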

Keywords