Arabic Speech Classification Method Based on Padding and Deep Learning Neural Network

Asroni Asroni; Ku Ruhana Ku-Mahamud; Cahya Damarjati; Hasan Basri Slamat

doi:10.21123/bsj.2021.18.2(Suppl.).0925

Baghdad Science Journal (Jun 2021)

Affiliations

Asroni Asroni: Universitas Muhammadiyah Yogyakarta, Indonesia
Ku Ruhana Ku-Mahamud: Universiti Utara Malaysia
Cahya Damarjati: Universitas Muhammadiyah Yogyakarta, Indonesia
Hasan Basri Slamat: Universitas Muhammadiyah Yogyakarta, Indonesia

DOI: https://doi.org/10.21123/bsj.2021.18.2(Suppl.).0925
Journal volume & issue: Vol. 18, no. 2(Suppl.)

Abstract

Deep learning convolutional neural networks (CNNs) have been widely used to recognize or classify voice. Various techniques have been used together with CNNs to prepare voice data before the training process in developing the classification model. However, not all models produce good classification accuracy, as there are many types of voice or speech. Arabic alphabet pronunciation is one such type, and accurate pronunciation is required when learning to read the Qur’an. Thus, processing the pronunciation and training on the processed data require a specific approach. To address this issue, a method based on padding and a deep learning convolutional neural network is proposed to evaluate the pronunciation of the Arabic alphabet. Voice data from six school children were recorded and used to test the performance of the proposed method. The padding technique was used to augment the voice data before feeding the data to the CNN structure to develop the classification model. In addition, three other feature extraction techniques were introduced for comparison with the proposed method, which employs the padding technique. The performance of the proposed method with the padding technique is on par with the spectrogram but better than the mel-spectrogram and mel-frequency cepstral coefficients. Results also show that the proposed method was able to distinguish the letters of the Arabic alphabet that are difficult to pronounce. The proposed method with the padding technique may be extended to assess pronunciation ability for speech other than the Arabic alphabet.
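
The abstract does not specify how the padding is performed, so the following is only a minimal sketch of one common reading: recordings of different lengths are right-padded with zeros to a shared length so that a CNN can receive fixed-size inputs. The function names `pad_to_length` and `pad_batch`, the choice of zero as the fill value, padding at the end of the clip, and the sample values are all assumptions made for illustration, not details taken from the paper.

```python
def pad_to_length(samples, target_len, pad_value=0.0):
    """Return a copy of `samples` with exactly `target_len` values.

    Shorter clips are right-padded with `pad_value`; longer clips are
    truncated. Both choices are assumptions made for this sketch.
    """
    samples = list(samples)
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [pad_value] * (target_len - len(samples))


def pad_batch(clips, pad_value=0.0):
    """Pad every clip to the length of the longest clip in the batch."""
    target_len = max(len(clip) for clip in clips)
    return [pad_to_length(clip, target_len, pad_value) for clip in clips]


# Hypothetical waveforms of unequal length (amplitude samples).
clips = [[0.1, 0.2, 0.3], [0.5], [0.4, 0.6]]
batch = pad_batch(clips)
for row in batch:
    print(row)
```

After this step every row has the same number of samples, which is the property a fixed-input network needs. Whether the authors pad the raw waveform or a derived representation is not stated in the abstract.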

Published in Baghdad Science Journal

ISSN: 2078-8665 (Print); 2411-7986 (Online)
Publisher: College of Science for Women, University of Baghdad
Country of publisher: Iraq
LCC subjects: Science
Website: http://bsj.uobaghdad.edu.iq/index.php/BSJ
