E3S Web of Conferences (Jan 2023)
Deep Learning-based Speech Emotion Recognition: An Investigation into a sustainably Emotion-Speech Relationship
Abstract
Speech Emotion Recognition (SER) is a challenging problem with promising applications in psychology, speech therapy, and customer service. This paper proposes an SER system built on machine learning techniques, in particular deep learning with recurrent neural networks. The model will be trained on a carefully labeled dataset of diverse speech samples representing a range of emotions. By analyzing audio features such as pitch, rhythm, and prosody, the system aims to recognize emotions accurately in previously unseen speech samples. The primary objective is to advance SER by improving accuracy and reliability and by gaining deeper insight into how a sustainable, complex relationship between emotions and speech can be established. Such a system could facilitate the practical deployment of emotion recognition technologies across multiple domains.
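As an illustration of the kind of audio feature mentioned above, the following is a minimal sketch of pitch estimation by autocorrelation. It is not the paper's feature pipeline, which the abstract does not specify; the function names `estimate_pitch_hz` and `sine_frame`, the 50 to 500 Hz search range, and the synthetic 200 Hz test tone are all assumptions chosen for the example.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency of one frame via autocorrelation.

    Hypothetical helper for illustration: fmin and fmax bound the search
    range and are assumed values, not parameters taken from the paper.
    """
    # Convert the frequency bounds into a range of candidate lags (in samples).
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    if max_lag < min_lag:
        raise ValueError("frame too short for the requested pitch range")

    def autocorr(lag):
        # Unnormalized autocorrelation of the frame with itself at this lag.
        return sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))

    # The lag with the strongest self-similarity approximates the pitch period.
    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sample_rate / best_lag


def sine_frame(freq_hz, sample_rate, n_samples):
    """Generate a synthetic sine tone to stand in for a voiced speech frame."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]


# Synthetic 200 Hz tone, 50 ms long at an 8 kHz sampling rate.
frame = sine_frame(200.0, 8000, 400)
pitch = estimate_pitch_hz(frame, 8000)
print(f"estimated pitch: {pitch:.1f} Hz")
```

In an SER setting, a per-frame estimate of this kind would be computed over successive short windows, and the resulting pitch contour, together with rhythm and other prosodic features, could form the input sequence to a recurrent network. Real speech is noisier than a pure tone, so a practical system would likely rely on a more robust pitch tracker.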