IEEE Access (Jan 2023)
Enhancing Biometric Speaker Recognition Through MFCC Feature Extraction and Polar Codes for Remote Application
Abstract
Although biometrics has been researched extensively, particularly face and fingerprint recognition, remote speaker recognition has yet to gain global acceptance because of challenges with accuracy and data integrity. Previous speaker recognition studies have explored techniques such as Mel Frequency Cepstral Coefficients (MFCC) and Convolutional Neural Networks (CNN), reporting accuracies of 90.4% and 92.8%, respectively, on a small, fixed database with a standalone system. This paper proposes a novel approach to address the data integrity and accuracy issues in remote speaker recognition. Remote speaker recognition was first implemented using a client-server setup, but channel noise prevented any noticeable improvement in accuracy over existing methods. The new approach extracts MFCC parameters from voice samples and then applies polar error-correcting codes to them, for both storage and transmission, to preserve fidelity. With a code rate of 1/2 and a block length of 1024 bits, transmitting polar-coded MFCC features over a noisy channel yielded a lower bit error rate when coupled with successive cancellation list decoding. Simulation results demonstrate a reduction in bit error rate, resulting in an accuracy of 95.2% in the implemented remote speaker recognition system. This is an improvement of roughly 5 percentage points (95.2% versus 90.4%) over the existing standalone system that uses uncoded MFCC features. These findings indicate that polar codes can be used effectively in speaker recognition systems to enhance their robustness and reliability, especially over noisy channels or in other challenging conditions.
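As a rough illustration of the coding step described in the abstract, the following Python sketch encodes one block of 512 payload bits into a 1024-bit polar codeword (code rate 1/2). It is a minimal sketch under stated assumptions, not the paper's implementation: the information set is chosen with the binary-erasure-channel Bhattacharyya recursion at erasure probability 0.5 (a common textbook heuristic that the abstract does not specify), the payload is random bits standing in for quantized MFCC features, and the list decoder is omitted. All function names are hypothetical.

```python
import random

N = 1024  # block length in bits (from the abstract)
K = 512   # information bits per block at code rate 1/2

def bec_bhattacharyya(levels, erasure=0.5):
    # ASSUMPTION: reliabilities come from the binary-erasure-channel
    # recursion; the abstract does not say how the code was constructed.
    z = [erasure]
    for _ in range(levels):
        # Each channel splits into a worse (2z - z^2) and a better (z^2) one.
        z = [v for zi in z for v in (2 * zi - zi * zi, zi * zi)]
    return z

def information_set(n=N, k=K):
    # Keep the k most reliable positions (smallest Bhattacharyya value).
    z = bec_bhattacharyya(n.bit_length() - 1)
    best = sorted(range(n), key=lambda i: z[i])[:k]
    return sorted(best)

def polar_transform(bits):
    # x = u * F^(kron n) over GF(2), F = [[1, 0], [1, 1]], no bit reversal.
    x = list(bits)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]
        step *= 2
    return x

def polar_encode(payload, info):
    # Frozen positions stay 0; payload bits fill the information positions.
    u = [0] * N
    for pos, bit in zip(info, payload):
        u[pos] = bit
    return polar_transform(u)

def polar_unencode(codeword, info):
    # Noiseless inverse only: the transform is its own inverse over GF(2).
    # A noisy channel would need a successive cancellation (list) decoder.
    u = polar_transform(codeword)
    return [u[pos] for pos in info]

INFO = information_set()
random.seed(0)
payload = [random.randint(0, 1) for _ in range(K)]  # stand-in for MFCC bits
codeword = polar_encode(payload, INFO)
recovered = polar_unencode(codeword, INFO)
print(len(codeword), recovered == payload)
```

The sketch only checks the noiseless round trip; any bit error rate comparison like the one reported above would additionally require a channel model and a decoder that operates on noisy observations.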
Keywords