Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) (Feb 2023)
Improved Classification of Handwritten Jawi Script Based on Main Part of Script Body
Abstract
Since the arrival of Islam, many ancient relics in the archipelago have been written in Jawi script. Human and natural factors put these relics at risk of damage or destruction. To avoid losing this heritage, the relics must be stored as digital documents, and converting those digital documents into machine-readable text requires Optical Character Recognition (OCR). In this research, OCR is applied to isolated Jawi characters. Freeman Chain Code (FCC) is used to extract features from each isolated character, and the FCC features are fed into a Support Vector Machine (SVM) to classify the character. SVM classification into 19 classes reached an accuracy of 81.58%, while merging the classes into 15 improved the accuracy to 84.21%. Decision rules are then applied to the classes produced by the SVM, using three additional features: the location of the dots (top, middle, or bottom), the number of dots (obtained by counting them), and the presence of holes in the character. Applying the decision rules to the results of the 15-class SVM classification achieved a success rate of 92.86%. Further research will examine the effect of the dot-location and dot-count features on the shape of the main part of the character.
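As an illustration of the FCC step, the following minimal Python sketch encodes an ordered character boundary as an 8-direction Freeman chain code. The abstract does not state the direction numbering or how the chain is turned into a fixed-length SVM input, so the convention used here (0 = east, counted counter-clockwise) and the normalized direction histogram are assumptions, and the function names are hypothetical.

```python
from collections import Counter

# Assumed convention: 8-connected codes, 0 = east, counted counter-clockwise.
# Points are (row, col) image coordinates, with rows growing downward.
DIRECTION_CODES = {
    (0, 1): 0,    # east
    (-1, 1): 1,   # north-east
    (-1, 0): 2,   # north
    (-1, -1): 3,  # north-west
    (0, -1): 4,   # west
    (1, -1): 5,   # south-west
    (1, 0): 6,    # south
    (1, 1): 7,    # south-east
}


def freeman_chain_code(boundary):
    """Encode an ordered list of 8-connected boundary pixels as chain codes."""
    codes = []
    for (r0, c0), (r1, c1) in zip(boundary, boundary[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in DIRECTION_CODES:
            raise ValueError(f"pixels {(r0, c0)} and {(r1, c1)} are not 8-neighbours")
        codes.append(DIRECTION_CODES[step])
    return codes


def chain_code_histogram(codes):
    """Turn a chain code into a fixed-length (8-bin) normalized histogram.

    This is one possible fixed-length feature for a classifier, not
    necessarily the representation used in the paper.
    """
    if not codes:
        return [0.0] * 8
    counts = Counter(codes)
    return [counts.get(direction, 0) / len(codes) for direction in range(8)]


# Usage: a closed 2x2 pixel square traced clockwise from the top-left corner.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
codes = freeman_chain_code(square)
print(codes)                        # [0, 6, 4, 2]
print(chain_code_histogram(codes))  # [0.25, 0.0, 0.25, 0.0, 0.25, 0.0, 0.25, 0.0]
```

A histogram discards the order of the moves, so it is only one of several ways to feed a variable-length chain code to an SVM; resampling the chain to a fixed length is another common choice.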
Keywords