Journal of Sensor and Actuator Networks (Oct 2022)

Enhancing Optical Character Recognition on Images with Mixed Text Using Semantic Segmentation

  • Shruti Patil,
  • Vijayakumar Varadarajan,
  • Supriya Mahadevkar,
  • Rohan Athawade,
  • Lakhan Maheshwari,
  • Shrushti Kumbhare,
  • Yash Garg,
  • Deepak Dharrao,
  • Pooja Kamat,
  • Ketan Kotecha

DOI
https://doi.org/10.3390/jsan11040063
Journal volume & issue
Vol. 11, no. 4
p. 63

Abstract

Optical Character Recognition (OCR) has made large strides in recognizing printed and properly formatted text. However, comparatively little effort has gone into developing systems that can reliably apply OCR to printed and handwritten text at the same time, as in hand-filled forms. Because machine-printed or typed text follows specific formats and fonts while handwritten text is variable and non-uniform, mixed text is very hard to classify and recognize using traditional OCR alone. This paper proposes a pre-processing methodology that employs semantic segmentation to identify, segment, and crop the boxes containing relevant text in a given image, in order to improve the results of conventional, online-available OCR engines. The paper also compares popular OCR engines: Microsoft Cognitive Services, Google Cloud Vision, and AWS Rekognition. The proposed pixel-wise classification technique identifies the areas of an image that contain relevant text and feeds them to a conventional OCR engine, with the aim of improving the quality of its output. The proposed methodology also supports the digitization of mixed-text documents with improved performance. The experimental study shows that the proposed pipeline, through complex image pre-processing, provides reliable, high-quality inputs to conventional OCR, which results in better accuracy and improved performance.
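
The crop step described in the abstract can be illustrated with a minimal sketch. It assumes the segmentation model has already produced a binary mask (1 for text pixels, 0 for background). The abstract does not specify the model or the post-processing, so the connected-component grouping below and the names `text_boxes` and `crop` are illustrative assumptions, not the authors' implementation. The call to an OCR engine is omitted.

```python
from collections import deque


def text_boxes(mask):
    """Return inclusive (top, left, bottom, right) boxes, one per
    4-connected region of 1s in a binary mask (list of lists)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over one text region,
            # tracking the extent of the region as it grows.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(image, box):
    """Cut the sub-image covered by an inclusive bounding box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy data: a 4x5 "image" of characters and a hypothetical mask
# marking two text regions.
image = [list("abcde"), list("fghij"), list("klmno"), list("pqrst")]
mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
boxes = text_boxes(mask)
crops = [crop(image, box) for box in boxes]
```

In a pipeline of this kind, each crop would then be sent to a conventional OCR engine separately, so that the engine sees only regions classified as text.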

Keywords