Mağallaẗ Al-kūfaẗ Al-handasiyyaẗ (Aug 2024)
A DATASET FOR THE DIACRITICS IMAGES OF THE HOLY QURAN: TOWARDS A TACTILE-VISION SYSTEM FOR THE VISUALLY IMPAIRED
Abstract
For blind and visually impaired people, the only way to read the Holy Quran is a special paper edition embossed in Braille. The author has proposed a new approach to reading the Holy Quran text for these readers. One of the main requirements of this approach is the classification of dots and diacritics, abbreviated as DaDs. This paper describes the creation of an image dataset of the DaDs of Al-Mushaf. A handheld scanner was developed for this purpose, and MATLAB programs were used for DaD segmentation. The final goal is to design a tactile-vision sensory substitution system, based on optical character recognition, that helps blind individuals read the Holy Quran as an alternative to Braille. Approximately 1750 images were captured from two distinct Al-Mushaf versions with the proposed handheld scanner. Using the suggested techniques and algorithms, 6000 DaDs were extracted from these images; after repeated, incomplete, and non-DaD images were eliminated, 4710 DaD images, arranged in 22 classes, were retained. The dataset is organized by DaD class, ready to be used directly for machine learning, and available for public use upon request.
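As an illustration of what "organized by DaD class" could mean in practice, the following is a minimal sketch of indexing such a dataset for machine learning. It assumes a one-folder-per-class layout (one subdirectory per DaD class, each holding that class's images); the abstract does not specify the actual layout or file format, and the names `index_dataset`, `class_counts`, and `IMAGE_SUFFIXES` are hypothetical helpers introduced here, not part of the published dataset.

```python
from collections import Counter
from pathlib import Path

# Assumed image extensions; the real dataset's file format is not stated.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def index_dataset(root):
    """Return (samples, class_names) for a one-folder-per-class image tree.

    samples is a list of (path, integer_label) pairs; class_names maps each
    integer label back to its folder name.
    """
    root = Path(root)
    # Each subdirectory is treated as one class (assumption).
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    label_of = {name: i for i, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        for path in sorted((root / name).iterdir()):
            # Skip anything that is not a recognised image file.
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label_of[name]))
    return samples, class_names


def class_counts(samples, class_names):
    """Count images per class, useful for checking class balance."""
    counts = Counter(label for _, label in samples)
    return {name: counts.get(i, 0) for i, name in enumerate(class_names)}
```

Under this assumed layout, calling `index_dataset` on the dataset root would be expected to yield 22 class names and 4710 samples, and `class_counts` would show how those images are distributed across the classes.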
Keywords