IMPROVED DESIGN OF DTW AND GMM CASCADED ARABIC SPEAKER

Shuoshuo Chen; Junbo Zhao; Ruiqi Yang

doi:10.21609/jiki.v6i2.221

Jurnal Ilmu Komputer dan Informasi (Nov 2013)

Journal volume & issue: Vol. 6, no. 2
pp. 39 – 44

Abstract


In this paper, we discuss the design, implementation and assessment of a two-stage Arabic speaker recognition system, which aims to identify a target Arabic speaker from a group of several people. The first stage uses an improved DTW (Dynamic Time Warping) algorithm and the second stage uses an SA-KM-based GMM (Gaussian Mixture Model). MFCC (Mel Frequency Cepstral Coefficients) and their difference forms are extracted from the speech samples as acoustic features. DTW selects the three most likely speakers, and these candidates are then passed to the GMM training process. A similarity measure, the KL (Kullback-Leibler) distance, is applied to find the best match with the target speaker. Experimental results show that the text-independent recognition rate of the cascaded system reaches 90 percent.
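
The two-stage cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses textbook DTW rather than the improved variant, and it substitutes a closed-form KL divergence between single diagonal-covariance Gaussians for the GMM comparison, since the abstract does not specify how KL distance is computed between mixtures. The names `dtw_distance`, `shortlist` and `kl_diag_gaussian` are hypothetical, and feature vectors are assumed to be plain lists of floats standing in for MFCC frames.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature vectors.

    Assumption: each frame is a list of floats (e.g. one MFCC vector).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def shortlist(query, templates, k=3):
    """Stage 1: return the k speaker names with the smallest DTW cost.

    `templates` maps a speaker name to one reference sequence (an assumption).
    """
    ranked = sorted(templates, key=lambda name: dtw_distance(query, templates[name]))
    return ranked[:k]


def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians.

    Simplified stand-in for stage 2: a full GMM has no closed-form KL,
    so this only shows the per-component formula.
    """
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )
```

In a cascade of this shape, `shortlist` would narrow the speaker set to three candidates, and a divergence such as `kl_diag_gaussian` would then rank models trained on those candidates against a model of the target utterance, with the smallest divergence taken as the best match.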

Published in Jurnal Ilmu Komputer dan Informasi

ISSN: 2088-7051 (Print); 2502-9274 (Online)
Publisher: Universitas Indonesia
Country of publisher: Indonesia
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://jiki.cs.ui.ac.id/index.php/jiki
