PLoS ONE (Jan 2023)

CEG: A joint model for causal commonsense events enhanced story ending generation.

  • Yushi Zhang,
  • Yan Yang,
  • Ming Gu,
  • Feng Gao,
  • Chengcai Chen,
  • Liang He

DOI
https://doi.org/10.1371/journal.pone.0286049
Journal volume & issue
Vol. 18, no. 5
p. e0286049

Abstract

Read online

With the success of pre-trained language models, the performance of story ending generation has been dramatically improved while remaining challenging due to the lack of commonsense reasoning ability. Most previous works mainly focus on using commonsense knowledge to enhance the implicit correlations between words but ignore the hidden causality of sentences or events. In this paper, we propose Causal commonsense Enhanced joint model for story ending Generation (CEG), which incorporates causal commonsense events knowledge to generate a reasonable story ending. Specifically, we first develop a commonsense events inference model trained on GLUCOSE, which converts static knowledge into a dynamic generation model to discover unseen knowledge. It uses prompts to produce various commonsense events behind the stories as pseudo-labels of the dataset. Then, we propose a joint model for the causal events inference task and the story ending generation task to inject inference knowledge into the generation, which consists of a shared encoder, an inference decoder, and a generation decoder. In the causal events inference task, we use the shared encoder and the inference decoder to reason the causal events behind each sentence of the story context to help the model better understand the story and provide long-distance dependencies for the story ending generation. In story ending generation, we combine the hidden states of the causal events with the story context to generate the story ending by the shared encoder and the generation decoder. We jointly train the model on two tasks so that the generation decoder produces the story endings that better match the clues. Experimental results on the ROCStories dataset show that our model outperforms the previous works, demonstrating the effectiveness of the joint model and the generated causal events.