IEEE Access (Jan 2021)

Deep Learning-Based Short Story Generation for an Image Using the Encoder-Decoder Structure

  • Kyungbok Min,
  • Minh Dang,
  • Hyeonjoon Moon

DOI
https://doi.org/10.1109/ACCESS.2021.3104276
Journal volume & issue
Vol. 9
pp. 113550 – 113557

Abstract

The use of artificial intelligence (AI) to generate image captions has been studied extensively in recent years. However, the generated captions are short and limited in number. In addition, it remains unclear whether a short story can be generated from an image, because a fluent story requires many connected sentences. This study therefore introduces an encoder-decoder framework for short story captioning (SSCap) that uses a common image caption dataset and a manually collected story corpus. This manuscript has three main contributions, which include 1) an unsupervised deep learning-based framework that combines a recurrent neural network (RNN) structure and an encoder-decoder model to compose a short story for an image, and 2) a large story corpus covering two genres (horror and romance) that was manually collected and validated. Extensive experiments demonstrated that the short stories created by the proposed model show more creative content than the output of existing systems, which can produce only concise sentences. Therefore, the demonstrated framework has the potential to motivate the development of a more robust AI story writer and the integration of the suggested model into practical applications that help story writers find new ideas.

Keywords