Applied Sciences (Nov 2023)

Advancing OCR Accuracy in Image-to-LaTeX Conversion—A Critical and Creative Exploration

  • Everistus Zeluwa Orji,
  • Ali Haydar,
  • İbrahim Erşan,
  • Othmar Othmar Mwambe

DOI
https://doi.org/10.3390/app132212503
Journal volume & issue
Vol. 13, no. 22
p. 12503

Abstract

This paper assesses how active learning strategies can improve natural language processing-based optical character recognition (OCR) models for image-to-LaTeX conversion. It addresses the limitations of existing OCR models and proposes practices to improve their accuracy. The key components of the study are the augmentation of training data with LaTeX syntax constraints, the integration of active learning strategies, and the use of active learning feedback loops. The paper first examines the current weaknesses of OCR models, with a particular focus on symbol recognition, complex equation handling, and noise mitigation. These limitations serve as the framework against which the subsequent methods are assessed. Augmenting the training data with LaTeX syntax constraints is a crucial strategy for improving model precision. Incorporating symbol relationships, so that contextual information is considered during recognition, further improves error correction. The paper then critically examines the application of active learning strategies. The active learning feedback loop leads to progressive improvements in accuracy. The article underlines the importance of uncertainty and diversity sampling in sample selection, which keeps the learning process efficient and effective. Appropriate evaluation metrics and ensemble techniques are used to improve the learning effectiveness of the OCR model. These techniques allow the model to adapt and perform more effectively in diverse application domains, further extending its utility.
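
To make the sample-selection idea concrete, the following is a minimal sketch of one common way to combine uncertainty and diversity sampling. It is not taken from the paper: the function names, the `shortlist_size` parameter, the use of prediction entropy as the uncertainty score, and the greedy farthest-point step for diversity are all assumptions chosen for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_batch(probs, features, k, shortlist_size):
    """Return indices of up to k unlabeled samples to send for annotation.

    probs[i]    : hypothetical model output distribution for sample i
    features[i] : hypothetical embedding of sample i
    """
    # Step 1 (uncertainty): shortlist the samples with the highest entropy.
    ranked = sorted(range(len(probs)), key=lambda i: entropy(probs[i]), reverse=True)
    shortlist = ranked[:shortlist_size]
    if k <= 0 or not shortlist:
        return []

    # Step 2 (diversity): greedy farthest-point selection inside the shortlist,
    # starting from the most uncertain sample.
    chosen = [shortlist[0]]
    while len(chosen) < min(k, len(shortlist)):
        best = max(
            (i for i in shortlist if i not in chosen),
            key=lambda i: min(euclidean(features[i], features[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen
```

In a feedback loop of the kind the abstract describes, the selected samples would be labeled, added to the training set, and the model retrained before the next selection round. The shortlist step keeps near-certain predictions out of the batch, while the farthest-point step avoids spending the labeling budget on near-duplicate images.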

Keywords