Jisuanji kexue yu tansuo (Jun 2021)

Summary of Multi-modal Sentiment Analysis Technology

  • LIU Jiming, ZHANG Peixiang, LIU Ying, ZHANG Weidong, FANG Jie

DOI
https://doi.org/10.3778/j.issn.1673-9418.2012075
Journal volume & issue
Vol. 15, no. 6
pp. 1165–1182

Abstract


Sentiment analysis refers to the use of computers to automatically analyze and determine the emotions that people intend to express. It plays a significant role in applications such as human-computer interaction and criminal investigation. Advances in deep learning and traditional feature extraction algorithms have made it feasible to perform sentiment analysis across multiple modalities. Combining multiple modalities compensates for the instability and limitations of single-modal sentiment analysis and can effectively improve accuracy. In recent years, researchers have performed sentiment analysis using three modalities: facial expression information, text information, and voice information. This paper surveys multi-modal sentiment analysis technology from the perspective of these three modalities. Firstly, it briefly introduces the basic concepts and research status of multi-modal sentiment analysis. Secondly, it summarizes the commonly used multi-modal sentiment analysis datasets and briefly describes existing single-modal sentiment analysis techniques based on facial expression, text and voice information. Next, modal fusion techniques are introduced in detail, and existing work on multi-modal sentiment analysis is described according to the different modal fusion methods used. Finally, the open problems of multi-modal sentiment analysis and future research directions are discussed.
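The abstract contrasts single-modal analysis with combining modalities through modal fusion. As a minimal sketch of the two fusion strategies most commonly distinguished in this literature, feature-level (early) fusion and decision-level (late) fusion, the Python snippet below concatenates per-modality feature vectors and averages per-modality sentiment probabilities. The feature dimensions, example probabilities, and function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical feature vectors produced by separate face, text, and audio
# encoders; the dimensions (128, 300, 64) are illustrative only.
face_feat = np.random.rand(128)
text_feat = np.random.rand(300)
audio_feat = np.random.rand(64)

def early_fusion(features):
    """Feature-level fusion: concatenate modality features into one joint
    vector that a single downstream classifier would consume."""
    return np.concatenate(features)

def late_fusion(probabilities, weights=None):
    """Decision-level fusion: average (optionally weighted) the class
    probabilities predicted separately for each modality."""
    probs = np.stack(probabilities)
    if weights is None:
        weights = np.ones(len(probabilities)) / len(probabilities)
    return np.average(probs, axis=0, weights=weights)

fused_vector = early_fusion([face_feat, text_feat, audio_feat])

# Hypothetical per-modality sentiment distributions (negative, neutral, positive).
face_probs = np.array([0.2, 0.3, 0.5])
text_probs = np.array([0.1, 0.2, 0.7])
audio_probs = np.array([0.3, 0.4, 0.3])
fused_probs = late_fusion([face_probs, text_probs, audio_probs])

print(fused_vector.shape)  # (492,): one joint feature vector
print(fused_probs)         # averaged sentiment distribution
```

In practice, early fusion lets a model learn cross-modal interactions at the cost of a larger input space, while late fusion keeps per-modality models independent and is more robust when one modality is missing; the survey organizes existing work along such fusion choices.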

Keywords