IEEE Access (Jan 2023)

muSi-ABC for Predicting Musical Emotions

  • Jing Yang

DOI: https://doi.org/10.1109/ACCESS.2023.3300042
Journal volume & issue: Vol. 11, pp. 79455–79465

Abstract

To address the insufficient accuracy and low training efficiency of general musical emotion prediction models, we propose the muSi-ABC architecture for predicting music emotions. Specifically, in the music emotion feature extraction stage, we use a benchmark feature set so that the extracted music emotion features conform to a common standard. In the prediction stage, the muSi-ABC architecture first uses a 2D-ConvNet (two-dimensional Convolutional Neural Network) to extract partial critical features of music emotions. A BiLSTM (Bi-directional Long Short-Term Memory) network then learns contextual sequence information about past and future music emotions from these partial critical features. Finally, an SA (Self-Attention) module derives the complete set of critical features most relevant to music emotions, improving both prediction accuracy and training efficiency. Ablation experiments at different time-term lengths verify the roles of the ConvNet model and the SA module, as well as the advantages of the proposed muSi-ABC architecture over the ablated models in training efficiency and prediction accuracy. We also observe that representing the emotions of the same song with long-term feature information enhances prediction accuracy. Finally, comparative experiments demonstrate that the proposed architecture outperforms other benchmark methods in prediction accuracy, and we validate that the outlier points contained in the music emotion features extracted from the benchmark feature set help reveal the variation trends of music emotions.
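For readers who want a concrete picture of the 2D-ConvNet → BiLSTM → Self-Attention pipeline the abstract describes, the following PyTorch sketch wires the three stages together. It is a minimal illustration, not the authors' published configuration: the layer sizes, the input shape (time steps × benchmark features), and the two-dimensional valence/arousal-style output are all assumptions.

# Minimal sketch of the muSi-ABC pipeline described in the abstract:
# 2D-ConvNet -> BiLSTM -> Self-Attention -> regression head.
# Layer sizes, input shape, and the 2-D emotion output are assumed
# for illustration only; they are not the paper's configuration.
import torch
import torch.nn as nn


class MuSiABC(nn.Module):
    def __init__(self, n_features=260, hidden=128, heads=4):
        super().__init__()
        # 2D-ConvNet: extracts partial critical features from the
        # (time x feature) representation of a song segment.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),  # pool along features only
        )
        # BiLSTM: learns contextual sequence information about past
        # and future emotions from the convolutional features.
        self.bilstm = nn.LSTM(
            input_size=16 * (n_features // 2),
            hidden_size=hidden,
            batch_first=True,
            bidirectional=True,
        )
        # Self-Attention: re-weights time steps by their relevance
        # to the emotion, yielding the complete critical features.
        self.attn = nn.MultiheadAttention(
            embed_dim=2 * hidden, num_heads=heads, batch_first=True
        )
        # Regression head, e.g. a valence/arousal pair per segment.
        self.head = nn.Linear(2 * hidden, 2)

    def forward(self, x):
        # x: (batch, time, n_features) benchmark feature vectors
        h = self.conv(x.unsqueeze(1))          # (B, 16, T, F/2)
        h = h.permute(0, 2, 1, 3).flatten(2)   # (B, T, 16*F/2)
        h, _ = self.bilstm(h)                  # (B, T, 2*hidden)
        h, _ = self.attn(h, h, h)              # attention over time
        return self.head(h.mean(dim=1))        # (B, 2) emotion values


if __name__ == "__main__":
    model = MuSiABC()
    dummy = torch.randn(4, 60, 260)  # 4 clips, 60 steps, 260 features
    print(model(dummy).shape)        # torch.Size([4, 2])

The ordering mirrors the abstract's reasoning: convolution condenses local spectral-feature patterns, the bidirectional recurrence supplies past-and-future context, and self-attention then selects the time steps most relevant to the emotion before a single regression readout.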

Keywords