Sistemasi: Jurnal Sistem Informasi (May 2024)
A Robust Gender Recognition System using Convolutional Neural Network on Indonesian Speaker
Abstract
Voice is a biometric trait: a person can be recognized by the sounds produced by their vocal cords and vocal tract. One application of voice is gender recognition. Despite extensive research, gender recognition using machine learning remains unsatisfactory because of the complexity of voice features and the limitations of conventional algorithms. In this research, voice-based gender recognition is performed with deep learning, using a Convolutional Neural Network (CNN). The input to the CNN is the output of feature extraction with the Mel-Frequency Cepstral Coefficients (MFCC) method, which produces Mel-spectrograms that capture important features of sound. The dataset consists of Indonesian speech. The model is evaluated under imbalanced and balanced dataset scenarios; the balanced dataset is produced by random undersampling of the majority class. In addition, the effect of splitting the data into training and testing sets at ratios of 70:30, 80:20, and 90:10 is observed. The results show that the model reaches 100% accuracy in all imbalanced dataset scenarios. For the balanced dataset scenarios, the highest accuracy is 99.65%, obtained with the 70:30 split. In summary, the CNN performs very well at identifying gender from voice features, although its performance decreases when random undersampling is applied to the dataset.
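The balanced-dataset scenario and the three train/test ratios described above can be illustrated with a minimal sketch. It is not the authors' implementation: the clip names, the class counts (70 male, 30 female), the helper names `random_undersample` and `split`, and the choice to reduce every class to the size of the smallest class are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly keep as many items per class as the smallest class has.

    Assumption: "balanced" means every class is cut down to the
    minority-class size; the abstract does not state the exact target.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        # Sample without replacement so no clip is duplicated.
        kept = rng.sample(items, target)
        balanced.extend((sample, label) for sample in kept)

    rng.shuffle(balanced)  # mix the classes before splitting
    return balanced


def split(pairs, train_fraction):
    """Cut an already shuffled list into training and testing parts."""
    cut = int(round(len(pairs) * train_fraction))
    return pairs[:cut], pairs[cut:]


# Hypothetical toy data standing in for the Indonesian speech clips.
samples = [f"clip_{i}.wav" for i in range(100)]
labels = ["male"] * 70 + ["female"] * 30

balanced = random_undersample(samples, labels)

# The three ratios named in the abstract: 70:30, 80:20, 90:10.
splits = {frac: split(balanced, frac) for frac in (0.7, 0.8, 0.9)}
for frac, (train, test) in splits.items():
    print(f"train fraction {frac}: {len(train)} train, {len(test)} test")
```

Undersampling discards majority-class recordings, so the balanced scenario trains on less data than the imbalanced one. That is one plausible reason for the lower accuracy reported for the balanced dataset, though the abstract itself does not give a cause.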