Localization of Speaker using Fusion Techniques and Neural Network Algorithms

Sawsan Jaddoa; Rasha Ali; Mohammed Najm Abdullah; Buthainah F. Abed

doi:10.31185/wjps.399

Wasit Journal for Pure Sciences (Jun 2024)

Affiliations

Sawsan Jaddoa: University of Baghdad, College of Education for Women, Computer Department, Baghdad, Iraq
Rasha Ali: University of Baghdad, College of Education for Women, Computer Department, Baghdad, Iraq
Mohammed Najm Abdullah: University of Technology, College of Engineering, Department of Computer Engineering, Baghdad, Iraq
Buthainah F. Abed: University of Baghdad, College of Education for Women, Computer Department, Baghdad, Iraq

DOI: https://doi.org/10.31185/wjps.399
Journal volume & issue: Vol. 3, no. 2

Abstract

Sound source localization, especially of speech and speakers, has recently become one of the most significant techniques because it is used in various applications such as smart environments, industry, robots, and audio conferences. These applications therefore demand higher accuracy. This paper proposes a speaker localization method that relies on speech signals in closed spaces and employs fusion techniques and neural network (NN) algorithms to improve accuracy. The proposed work classifies the speaker signals in three phases: preprocessing, feature extraction, and classification. A data fusion technique was used to generate the speaker dataset. In the feature extraction phase, a feature fusion technique was used to construct a feature vector from Generalized Cross Correlation (GCC) for time delay estimation, and from Root-MUSIC and Minimum Variance Distortionless Response (MVDR) for the direction of arrival of the signal source. In the classification stage, two NN algorithms were used: a Restricted Boltzmann Machine (RBM), implemented using the TensorFlow library, and Long Short-Term Memory (LSTM), implemented using the Keras library. The experimental results show an accuracy of 99.84% for RBM and 99.15% for LSTM.

Keywords: Speaker localization, data fusion, feature fusion, RBM, LSTM.
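
The abstract names Generalized Cross Correlation (GCC) as the time delay estimator that feeds the fused feature vector. As an illustration only, the sketch below estimates the delay between two microphone signals with NumPy. The PHAT weighting, the function name gcc_phat, and the parameters fs and max_tau are assumptions made for this example; the abstract does not say which GCC weighting or implementation the authors used.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (seconds) of `sig` relative to `ref` via GCC-PHAT.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Zero-pad to the combined length so the correlation is linear, not circular.
    n = sig.shape[0] + ref.shape[0]
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting (keep phase, discard magnitude).
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Limit the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs
```

A positive return value means sig lags ref. In a pipeline like the one described, a delay of this kind, computed per microphone pair, could be concatenated with direction-of-arrival estimates to form the fused feature vector, although the exact layout of that vector is not given in the abstract.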

Published in Wasit Journal for Pure Sciences

ISSN: 2790-5233 (Print); 2790-5241 (Online)
Publisher: College of Education for Pure Sciences
Country of publisher: Iraq
LCC subjects: Science
Website: https://wjps.uowasit.edu.iq/index.php/wjps/index
