IEEE Access (Jan 2022)

Korean Drama Scene Transcript Dataset for Emotion Recognition in Conversations

  • Sudarshan Pant,
  • Eunchae Lim,
  • Hyung-Jeong Yang,
  • Guee-Sang Lee,
  • Soo-Hyung Kim,
  • Young-Shin Kang,
  • Hyerim Jang

DOI
https://doi.org/10.1109/ACCESS.2022.3221408
Journal volume & issue
Vol. 10
pp. 119221 – 119231

Abstract

Understanding emotions in conversation is a challenging task, as sentences often carry an implied meaning that is not generally understood in isolation. Efficient use of contextual information is essential for emotion recognition in conversations. Many published datasets provide contextual information for situations such as text-based online messaging, chatbots, and movie dialogues. However, such dialogue-based datasets are collected by selecting ideal conversational situations and thus do not include many variations in dialogue length and number of participants. Therefore, such datasets may not be applicable to emotion recognition in text-based movie transcripts, where scenes vary in the number of speakers and the length of spoken sentences. We present a conversation dataset based on Korean television show transcripts to analyze emotions in the presence of scene context. The Korean Drama Scene Transcript dataset for Emotion Recognition (KD-EmoR) is a text-based conversation dataset. We analyze three classes of complex emotions (euphoria, dysphoria, and neutral) in the scenes of a television drama to build a publicly available dataset for further research. We developed a context-aware deep learning model that classifies emotions using speaker-level context and scene context, and it achieved an F1-score of 0.63 on the proposed dataset.

Keywords