Text-Independent Speaker Identification Through Feature Fusion and Deep Neural Network

Rashid Jahangir; Ying Wah Teh; Nisar Ahmed Memon; Ghulam Mujtaba; Mahdi Zareei; Uzair Ishtiaq; Muhammad Zaheer Akhtar; Ihsan Ali

doi:10.1109/ACCESS.2020.2973541

IEEE Access (Jan 2020)

Affiliations

Rashid Jahangir: Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Ying Wah Teh: Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Nisar Ahmed Memon: College of Computer Sciences and Information Technology (CCSIT), King Faisal University, Al Ahsa, Saudi Arabia
Ghulam Mujtaba: Department of Computer Science, Center of Excellence for Robotics, Artificial Intelligence and Blockchain, Sukkur IBA University, Sukkur, Pakistan
Mahdi Zareei: Escuela de Ingeniería y Ciencias, Zapopan, Tecnológico de Monterrey, Zapopan, Mexico
Uzair Ishtiaq: Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Muhammad Zaheer Akhtar: Department of Computer Science, Vehari Campus, COMSATS University Islamabad, Vehari, Pakistan
Ihsan Ali: Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia

DOI: https://doi.org/10.1109/ACCESS.2020.2973541
Journal volume & issue: Vol. 8
pp. 32187 – 32202

Abstract

Speaker identification refers to the process of recognizing a speaker from their voice using artificial intelligence techniques. Speaker identification technologies are widely applied in voice authentication, security and surveillance, electronic voice eavesdropping, and identity verification. Extracting discriminative and salient features from speaker utterances is essential for identifying speakers accurately, and researchers have recently proposed various features for this purpose. Most studies on speaker identification have used short-time features, such as perceptual linear predictive (PLP) coefficients and Mel frequency cepstral coefficients (MFCC), owing to their ability to capture the repetitive nature of speech signals efficiently. Various studies have shown the effectiveness of MFCC features in correctly identifying speakers. However, the performance of these features degrades on complex speech datasets, where they fail to capture speaker characteristics accurately. To address this problem, this study proposes a novel fusion of MFCC and time-based features (MFCCT), which combines the strengths of MFCC and time-domain features to improve the accuracy of text-independent speaker identification (SI) systems. The extracted MFCCT features were fed as input to a deep neural network (DNN) to construct the speaker identification model. Results showed that the proposed MFCCT features coupled with the DNN outperformed the baseline MFCC and time-domain features on the LibriSpeech dataset. In addition, the DNN obtained better classification results than five machine learning algorithms recently used in speaker recognition. This study also evaluated one-level and two-level classification methods for speaker identification; the experimental results showed that two-level classification performed better than one-level classification. The proposed features and classification model can be applied to different types of speaker datasets.
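
The fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list which time-domain features were used, so zero-crossing rate and RMS energy are assumed here purely for illustration. The function names (`frame_signal`, `mel_filterbank`, `mfcc`, `time_features`, `fused_features`), the 25 ms / 10 ms framing, and the 26-filter, 13-coefficient MFCC settings are likewise hypothetical choices. Each frame's MFCC vector is concatenated with its time-domain features to form one fused vector per frame, which could then be fed to a classifier.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):          # rising edge of the triangle
            fb[i, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):          # falling edge of the triangle
            fb[i, k] = (hi - k) / max(hi - mid, 1)
    return fb

def mfcc(frames, sr, n_filters=26, n_coeffs=13):
    """Per-frame MFCCs: window -> power spectrum -> log mel energies -> DCT-II."""
    n_fft = frames.shape[1]
    windowed = frames * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(windowed, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    n = np.arange(n_filters)
    basis = np.cos(np.pi * (n[None, :] + 0.5) * np.arange(n_coeffs)[:, None] / n_filters)
    return log_mel @ basis.T

def time_features(frames):
    """Assumed time-domain features: zero-crossing rate and RMS energy."""
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return np.column_stack([zcr, rms])

def fused_features(x, sr, frame_ms=25, hop_ms=10):
    """Concatenate MFCC and time-domain features frame by frame."""
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(x, frame_len, hop)
    return np.hstack([mfcc(frames, sr), time_features(frames)])

# Synthetic one-second test signal (a 220 Hz tone plus a little noise).
sr = 16000
t = np.arange(sr) / sr
x = 0.5 * np.sin(2 * np.pi * 220 * t) + 0.01 * np.random.default_rng(0).standard_normal(sr)
feats = fused_features(x, sr)
print(feats.shape)  # (number of frames, 13 MFCCs + 2 time-domain features)
```

In practice the fused vectors would usually be standardized per dimension before training, since MFCC values and time-domain features such as zero-crossing rate lie on very different scales.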

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
