Data in Brief (Apr 2023)
HMPLMD: Handwritten Malayalam palm leaf manuscript dataset
Abstract
High recognition rates for degraded documents such as palm leaf manuscripts depend primarily on document enhancement. Among document enhancement approaches, which also include non-deep-learning models and thresholding methods, recent advances in deep learning models play a major role. Readily available ground truth data is of paramount importance for creating deep learning models, because preparing it is a highly time-consuming task. Ground truth preparation is further complicated because ancient documents are affected by degradations such as fungi, humidity, uneven illumination, discoloration, holes, cracks, and other damage. We propose the Handwritten Malayalam Palm Leaf Manuscript Dataset (HMPLMD) and its ground truth data to support advances in palm leaf image analysis. We use palm leaf manuscripts of Kambaramayanam and Jathakas for experimentation. The proposed ground truth samples of degraded palm leaves play a crucial role in creating specialized deep/transfer learning models to handle challenges related to binarization.
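To make the thresholding baseline mentioned above concrete, the following is a minimal sketch of global Otsu thresholding in NumPy. It is an illustration only and is not taken from the dataset paper: the function names `otsu_threshold` and `binarize` are hypothetical, and the input is assumed to be an 8-bit grayscale image with dark ink on a brighter leaf surface.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return a global Otsu threshold (0-255) for an 8-bit grayscale image.

    Assumption: `gray` has dtype uint8. This is a generic baseline,
    not the method used to produce the HMPLMD ground truth.
    """
    # Normalized histogram of the 256 possible intensity levels.
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    omega = np.cumsum(prob)           # probability of the "dark" class up to each level
    mu = np.cumsum(prob * levels)     # cumulative mean intensity up to each level
    mu_total = mu[-1]

    # Between-class variance; it is undefined (0/0) where one class is empty.
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0

    # The threshold is the level that maximizes the between-class variance.
    return int(np.argmax(sigma_b))


def binarize(gray: np.ndarray) -> np.ndarray:
    """Map pixels above the Otsu threshold to 1 (leaf) and the rest to 0 (ink)."""
    t = otsu_threshold(gray)
    return (gray > t).astype(np.uint8)
```

A single global threshold of this kind tends to struggle with the degradations listed in the abstract, such as uneven illumination and discoloration, which is one reason ground truth data for training learned binarization models is valuable.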