Data Augmentation for Voiceprint Recognition Using Generative Adversarial Networks

Yao-San Lin; Hung-Yu Chen; Mei-Ling Huang; Tsung-Yu Hsieh

doi:10.3390/a17120583

Algorithms (Dec 2024)

Affiliations

Yao-San Lin: Department of Industrial Engineering and Management, National Chin-Yi University of Technology, No. 57, Sec. 2, Zhongshan Rd., Taiping Dist., Taichung 411, Taiwan
Hung-Yu Chen: Department of Information Management, National Chin-Yi University of Technology, No. 57, Sec. 2, Zhongshan Rd., Taiping Dist., Taichung 411, Taiwan
Mei-Ling Huang: Department of Industrial Engineering and Management, National Chin-Yi University of Technology, No. 57, Sec. 2, Zhongshan Rd., Taiping Dist., Taichung 411, Taiwan
Tsung-Yu Hsieh: Department of Industrial Engineering and Management, National Chin-Yi University of Technology, No. 57, Sec. 2, Zhongshan Rd., Taiping Dist., Taichung 411, Taiwan

DOI: https://doi.org/10.3390/a17120583
Journal volume & issue: Vol. 17, no. 12
p. 583

Abstract

Voiceprint recognition systems often face challenges related to limited and diverse datasets, which hinder their performance and generalization capabilities. This study proposes a novel approach that integrates generative adversarial networks (GANs) for data augmentation and convolutional neural networks (CNNs) with mel-frequency cepstral coefficients (MFCCs) for voiceprint classification. Experimental results demonstrate that the proposed methodology improves recognition accuracy by up to 15% in low-resource scenarios. The optimal ratio of real-to-GAN-generated samples was determined to be 3:2, which balanced dataset diversity and model performance. In specific cases, the model achieved an accuracy of 96.6%, showcasing its effectiveness in capturing unique voice characteristics while mitigating overfitting. These results highlight the potential of combining GAN-augmented data and CNN-based classification to enhance voiceprint recognition in diverse and resource-constrained environments.
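As an illustration of the 3:2 real-to-GAN-generated ratio reported above, the following minimal Python sketch assembles a mixed training set. It is not the authors' implementation: the function name `mix_real_and_generated`, the choice to keep every real sample and subsample the generated ones, and the placeholder strings standing in for MFCC feature arrays are all assumptions made for illustration.

```python
import random


def mix_real_and_generated(real, generated, ratio=(3, 2), seed=0):
    """Return a shuffled list of (sample, source) pairs.

    Hypothetical helper: keeps every real sample and subsamples the
    generated ones so that real:generated matches `ratio` (rounded down).
    """
    real_part, gen_part = ratio
    n_gen = (len(real) * gen_part) // real_part
    if n_gen > len(generated):
        raise ValueError("not enough generated samples for the requested ratio")
    rng = random.Random(seed)
    chosen = rng.sample(generated, n_gen)
    mixed = [(x, "real") for x in real] + [(x, "generated") for x in chosen]
    rng.shuffle(mixed)
    return mixed


# Placeholder items stand in for MFCC feature arrays (an assumption).
real = [f"real_{i}" for i in range(30)]
generated = [f"gan_{i}" for i in range(50)]

train = mix_real_and_generated(real, generated)
counts = {"real": 0, "generated": 0}
for _, source in train:
    counts[source] += 1
print(counts)  # {'real': 30, 'generated': 20}
```

With 30 real samples, the sketch draws 20 generated samples, so the combined set of 50 preserves the 3:2 proportion. In practice the ratio would be applied per speaker class rather than over the whole pool, which this sketch does not attempt.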

Published in Algorithms

ISSN: 1999-4893 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.mdpi.com/journal/algorithms
