Research in Statistics (Jul 2024)

Biometric voice recognition system in the context of multiple languages: using traditional means of identification of individuals in Nigeria languages and English language

  • Edmund Nnabueze Ajimah,
  • Ogechukwu N. Iloanusi

DOI
https://doi.org/10.1080/27684520.2024.2362298
Journal volume & issue
Vol. 2, no. 1

Abstract

Voice biometrics is challenging at every stage, from voice data acquisition through processing to the matching module. Challenges for an automatic voice biometric system include background noise, mimicry, and voice playback. This research focuses on identifying a person in the context of varying languages. Mel-Frequency Cepstral Coefficients (MFCC), Gammatone Cepstral Coefficients (GTCC), pitch, and the i-vector are feature extraction techniques that are effective for differentiating individuals. In this work, a new feature extractor derived from pitch, GTCC, and MFCC is proposed. A novel AfroVoices database of 94 subjects was collected, with ten (10) voiceprints of Nigerian localities per subject: five (5) spoken in English and five (5) in vernacular, for a total of 940 voiceprints. The experiments were performed in the MATLAB environment on three voice datasets: AfroVoices, UNN_BVC, and LibriSpeech. They followed the traditional approach of recognizing individuals with a biometric system. The results reveal that varying languages do not affect voice biometric performance. They also reveal that the presence of noise degraded the performance of the system, as clean utterances performed relatively better.

Keywords