Iraqi Journal for Computer Science and Mathematics (Jan 2022)

Detecting The Speaker Language Using CNN Deep Learning Algorithm

  • Fawziya M. Rammo,
  • Mohammed N. Al-Hamdani

DOI
https://doi.org/10.52866/ijcsm.2022.01.01.005
Journal volume & issue
Vol. 3, no. 1

Abstract

Many language identification (LID) systems rely on language models built with machine learning (ML) approaches, and they typically need rather long recordings to achieve satisfactory accuracy. This study aims to extract enough information from short recording intervals to classify the spoken languages under test successfully. Classification is based on frames of 2 to 18 seconds, whereas most previous LID systems were based on much longer time frames (from 3 seconds to 2 minutes). The research defined and implemented many low-level features using Mel-frequency cepstral coefficients (MFCC). The data come from voxforge.org, an open-source corpus of user-submitted audio clips in various languages, and consist of speech files in five languages (English, French, German, Italian, Spanish). A convolutional neural network (CNN) was applied for classification: binary language classification reached an accuracy of 100%, and five-language classification reached an accuracy of 99.8%.
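To make the MFCC feature step concrete, the following is a minimal pure-Python sketch of how one short audio frame can be turned into cepstral coefficients: Hamming window, power spectrum, triangular mel filterbank, logarithm, then a DCT-II. It is an illustration only, not the authors' implementation. The 8 kHz sample rate, 256-sample frame, 20 filters, 13 coefficients, and all function names are assumptions chosen for the example; the abstract does not state the settings used.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (O'Shaughnessy variant).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame and return its one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(w * math.cos(2 * math.pi * k * i / n) for i, w in enumerate(windowed))
        im = -sum(w * math.sin(2 * math.pi * k * i / n) for i, w in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [lo + (hi - lo) * j / (n_filters + 1) for j in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """Return n_coeffs MFCCs for a single frame (assumed parameter defaults)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for empty filter bands.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    # DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Synthetic 440 Hz tone standing in for one frame of speech.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc(frame, sample_rate)
print(len(coeffs))
```

In a pipeline like the one the abstract describes, such coefficient vectors would be computed for successive frames of a 2 to 18 second clip and stacked into a two-dimensional array, which is the kind of image-like input a CNN classifier can consume.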

Keywords