IEEE Access (Jan 2020)

Generative Adversarial Networks in Human Emotion Synthesis: A Review

  • Noushin Hajarolasvadi,
  • Miguel Arjona Ramirez,
  • Wesley Beccaro,
  • Hasan Demirel

DOI
https://doi.org/10.1109/ACCESS.2020.3042328
Journal volume & issue
Vol. 8
pp. 218499 – 218529

Abstract

Deep generative models have become an emerging topic in research areas such as computer vision and signal processing. These models can synthesize realistic data samples that are of great value to both the academic and industrial communities. Affective computing, a topic of broad interest in the computer vision community, has been no exception and has benefited from this powerful approach. In fact, affective computing has seen a rapid development of generative models during the last two decades. Applications of such models include, but are not limited to, emotion recognition and classification, unimodal emotion synthesis, and cross-modal emotion synthesis. We therefore conducted a comprehensive survey of recent advances in human emotion synthesis, studying the available databases and the advantages and disadvantages of the generative models, along with the related training strategies, for the two principal human communication modalities, namely audio and video. In this context, facial expression synthesis, speech emotion synthesis, and audio-visual (cross-modal) emotion synthesis are reviewed extensively under different application scenarios. We then discuss open research problems to push the boundaries of this research area in future work. In conclusion, we indicate common problems that can be explored from the perspective of Generative Adversarial Network (GAN) topologies and applications in emotion synthesis.

Keywords