Convolutional Autoencoder for Reconstruction of Historical Document Images: Ancient Manuscript Babad Lombok

Fahmi Syuhada; Asno Azzawagama Firdaus; Ana Tsalitsatun Ni'mah; Yuan Sa’adatai; Muhammad Tajuddin

doi:10.21107/rekayasa.v17i1.26101

Rekayasa (Jun 2024)

Affiliations

Fahmi Syuhada: Computer Science Study Program, Universitas Qamarul Huda Badaruddin Bagu
Asno Azzawagama Firdaus: Computer Science Study Program, Universitas Qamarul Huda Badaruddin Bagu
Ana Tsalitsatun Ni'mah: Informatics Education, Universitas Trunojoyo Madura
Yuan Sa’adatai: Computer Science Study Program, Universitas Qamarul Huda Badaruddin Bagu
Muhammad Tajuddin: Computer Science, Universitas Bumi Gora

DOI: https://doi.org/10.21107/rekayasa.v17i1.26101
Journal volume & issue: Vol. 17, no. 1
pp. 175 – 185

Abstract

The Babad Lombok is an ancient manuscript that mainly contains stories about the origins of the people of Lombok. It is written on lontar leaf, a material that was used in the past for manuscripts, letters, and documents. Today the Babad Lombok is available as photographs or scans, so it can be viewed without visiting the museum or cultural heritage site where it is usually exhibited. However, because the manuscript is hundreds of years old, both the original and its scanned versions have faded, which makes the text less legible. This paper proposes to reconstruct (repair) Babad Lombok images automatically using a neural network, specifically a Convolutional Autoencoder (CAE). The CAE model is built sequentially and trained with original Babad Lombok images as input and manually corrected Babad Lombok images as the target (ground truth). To prepare the data, both sets of images are iteratively cropped into 64x64 patches across the full extent of each Babad Lombok image. This yields 48,288 input images and 48,288 target images for CAE training. Testing the trained autoencoder shows that the Babad images are successfully repaired, with text that is clearer than before reconstruction. The proposed CAE achieves training and validation accuracies of 89.09% and 94.57%, with corresponding loss values of 0.0418 and 0.0226.
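The patch preparation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the abstract gives only the 64x64 patch size, so the non-overlapping stride, the dropping of partial edge tiles, and the function names `crop_patches` and `paired_patches` are all assumptions made for the example.

```python
import numpy as np

PATCH = 64  # patch size stated in the abstract


def crop_patches(image, patch=PATCH, stride=PATCH):
    """Slide a patch x patch window over an H x W (or H x W x C) array.

    Assumption: stride equals the patch size (no overlap) and partial
    tiles at the right and bottom edges are dropped. The abstract does
    not specify either choice.
    """
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            patches.append(image[top:top + patch, left:left + patch])
    if not patches:
        # Image smaller than one patch: return an empty, correctly shaped array.
        return np.empty((0, patch, patch) + image.shape[2:], dtype=image.dtype)
    return np.stack(patches)


def paired_patches(original, corrected, patch=PATCH, stride=PATCH):
    """Crop an original scan and its manually corrected version identically,
    so that patch i of the input lines up with patch i of the target."""
    if original.shape != corrected.shape:
        raise ValueError("original and corrected scans must have the same shape")
    return (crop_patches(original, patch, stride),
            crop_patches(corrected, patch, stride))
```

Cropping both images with the same window positions is what keeps each input patch aligned with its ground-truth patch; any overlap or padding scheme would work as long as it is applied identically to both.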

Published in Rekayasa

ISSN: 0216-9495 (Print); 2502-5325 (Online)
Publisher: Lembaga Penelitian dan Pengabdian kepada Masyarakat
Country of publisher: Indonesia
LCC subjects: Technology
Website: https://journal.trunojoyo.ac.id/rekayasa
