Data in Brief (Apr 2023)

HMPLMD: Handwritten Malayalam palm leaf manuscript dataset

  • B.J. Bipin Nair
  • N. Shobha Rani

Journal volume & issue
Vol. 47
p. 108960

Abstract

Achieving high recognition rates on degraded documents such as palm leaf manuscripts relies primarily on document enhancement. Advances in deep learning models for document enhancement play a major role compared with non-deep-learning models and thresholding methods. Preparing readily available ground truth data for building deep learning models is of paramount importance, as it is a highly time-consuming task. Ground truth preparation is made more complex because ancient documents are affected by degradations such as fungi, humidity, uneven illumination, discoloration, holes, cracks, and other damage. We propose a Handwritten Malayalam Palm Leaf Manuscript Dataset (HMPLMD) and its ground truth data, aiming to advance the field of palm leaf image analysis. We use the palm leaf manuscripts of Kambaramayanam and Jathakas for experimentation. The proposed ground truth samples of degraded palm leaves play a crucial role in creating specialized deep/transfer learning models to handle challenges related to binarization.

Keywords