IEEE Access (Jan 2024)
Utilizing Enhanced Particle Swarm Optimization for Feature Selection in Gender-Emotion Detection From English Speech Signals
Abstract
Speech emotion recognition (SER) plays a vital role in various applications, enabling machines to decode and analyze emotions conveyed through speech. This study introduces a novel approach, Dynamic Gender-Aware Enhanced Binary Particle Swarm Optimization (DGA-EBPSO), which uses a gender-specific particle swarm optimization (PSO) technique for feature selection to improve SER accuracy while promoting interpretability. We extract pitch and acoustic-related statistical features from speech samples and develop separate models for gender prediction and emotion detection. The gender-specific DGA-EBPSO algorithm incorporates a hybrid mutation strategy to improve feature selection efficiency and accounts for gender-based variations in emotional expression. Notably, the DGA-EBPSO framework prioritizes interpretability: the selected features can be analyzed to identify those most influential for gender and emotion classification. This focus on interpretability aligns with the principles of responsible AI development, supporting transparency and helping to mitigate potential biases in SER systems. The effectiveness of our method is demonstrated by its superior classification performance (accuracy, precision, recall, and F1 score) on benchmark emotional speech datasets such as CREMA-D, EmergencyCalls, IEMOCAP, and RAVDESS.
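To illustrate the kind of search that binary PSO performs for feature selection, the following is a minimal Python sketch under stated assumptions. It is not the DGA-EBPSO implementation: the names `binary_pso` and `toy_fitness`, all parameter values, and the plain bit-flip mutation (a generic stand-in for the hybrid mutation strategy, whose details are not given in this abstract) are hypothetical. In a real SER pipeline the fitness would presumably be a classifier's validation score on the selected features rather than the toy function used here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, p_mut=0.02, v_max=4.0, seed=0):
    """Generic binary PSO: each particle is a 0/1 mask over the features."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]

    # Personal bests and the global best (higher fitness is better).
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # clamp velocity
                # Sigmoid transfer: velocity sets the probability of bit = 1.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
                # Assumed bit-flip mutation (NOT the paper's hybrid mutation).
                if rng.random() < p_mut:
                    pos[i][d] = 1 - pos[i][d]
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical toy fitness: features 0, 2, and 5 are "informative";
# every selected feature costs a small penalty to favor compact subsets.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


mask, score = binary_pso(toy_fitness, n_features=10)
print("selected features:", [i for i, bit in enumerate(mask) if bit])
print("fitness:", score)
```

One way to make such a search gender-aware, again as an assumption rather than the authors' stated procedure, would be to run it separately on the samples of each predicted gender, with the fitness evaluated on that subset only. The resulting masks can then be compared to see which features matter for each group.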
Keywords