IEEE Access (Jan 2023)
Equilibrium Optimizer for Emotion Classification From English Speech Signals
Abstract
Speech emotion recognition and its precise classification are challenging tasks that depend heavily on the quality of feature extraction and selection for speech signals. Many feature selection algorithms have been proposed for this task; however, their accuracy has not reached a satisfactory level. We introduce an improved equilibrium optimizer (iEO) algorithm and use mel frequency cepstral coefficients (MFCCs) and pitch features for emotion recognition. A transfer function is used to binarize iEO (BiEO), and the algorithm adopts multi-swarm and transfer functions to balance global search and local search. The performance of the proposed algorithm is verified on four English speech emotion datasets: eNTERFACE05, the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Surrey Audio-Visual Expressed Emotion (SAVEE), and the Toronto Emotional Speech Set (TESS). The experimental results show that the proposed algorithm obtains accuracies of 0.4923, 0.5581, 0.5575 and 0.9840 on eNTERFACE05, RAVDESS, SAVEE and TESS, respectively, with K-nearest neighbors, and accuracies of 0.5279, 0.5862, 0.6752 and 0.9941 with random forest.
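To illustrate the binarization step mentioned above, the following is a minimal sketch of how a transfer function can map a continuous optimizer position to a binary feature mask. The abstract does not state which transfer function BiEO uses, so the S-shaped (sigmoid) function here is an assumption, and the names `sigmoid_transfer` and `binarize_position` are hypothetical.

```python
import math
import random


def sigmoid_transfer(x):
    """S-shaped transfer function (an assumption, not necessarily the one used in BiEO)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize_position(position, rng):
    """Map a continuous position vector to a 0/1 feature-selection mask.

    Each dimension is set to 1 (feature kept) when a uniform random draw
    falls below the transfer-function value for that dimension.
    """
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


# Hypothetical continuous position with five dimensions (one per candidate feature).
rng = random.Random(0)
position = [-50.0, -1.0, 0.0, 1.0, 50.0]
mask = binarize_position(position, rng)

# Indices of the features that the mask keeps.
selected = [i for i, bit in enumerate(mask) if bit == 1]
```

Strongly negative positions are almost never selected and strongly positive ones almost always are, while values near zero are selected about half of the time, which is how a transfer function of this kind trades off exploration against exploitation in the binary search space.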
Keywords