IEEE Access (Jan 2019)

Word-Based POMDP Dialog Management via Hybrid Learning

  • Shuyu Lei,
  • Xiaojie Wang,
  • Caixia Yuan

DOI
https://doi.org/10.1109/ACCESS.2019.2903863
Journal volume & issue
Vol. 7
pp. 39236 – 39243

Abstract

Dialog management plays an important role in task-oriented dialog systems. Most previous work divides dialog management into a state tracker and an action selector. The two parts are modeled separately and implemented as a pipeline, which suffers from error accumulation and prevents the feedback signal from the action selector from propagating back to the state tracker and the natural language understanding module. This paper proposes a word-based partially observable Markov decision process (POMDP) dialog manager that integrates natural language understanding, state tracking, and action selection into an end-to-end architecture. The proposed dialog manager takes the words of user utterances as input and produces the optimal action together with the slot values from natural language understanding that are needed for response generation. To this end, we propose a hybrid learning method, which integrates reinforcement learning and supervised learning, to optimize the action selector and slot filler jointly. In addition, we develop a high-return prioritized experience replay to speed up the convergence of training. Experimental results show that the proposed dialog manager outperforms four strong baselines on a series of different dialog tasks. An evaluation with human users shows the same result. The high-return prioritized experience replay accelerates convergence effectively, especially when the proposed dialog manager works on more complex tasks.
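
The abstract names a high-return prioritized experience replay but does not describe its mechanics. As a rough illustration only, the Python sketch below shows one way such a buffer could work: whole dialog episodes are stored with their total return, the lowest-return episode is evicted when the buffer is full, and episodes are sampled with probability that grows with their return. The class name `HighReturnReplay`, the eviction rule, and the shifted-return weighting are all assumptions made for this sketch, not details taken from the paper.

```python
import random
from collections import namedtuple

# Hypothetical container: one stored dialog episode and its total return.
Episode = namedtuple("Episode", ["transitions", "ret"])


class HighReturnReplay:
    """Illustrative replay buffer that favors high-return episodes.

    Assumed design (not from the paper): evict the lowest-return episode
    when full, and sample in proportion to the shifted return.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.episodes = []
        self.rng = random.Random(seed)

    def add(self, transitions, ret):
        """Store an episode; drop the lowest-return one if over capacity."""
        self.episodes.append(Episode(list(transitions), float(ret)))
        if len(self.episodes) > self.capacity:
            worst = min(range(len(self.episodes)),
                        key=lambda i: self.episodes[i].ret)
            self.episodes.pop(worst)

    def priorities(self):
        """Shift returns so every sampling weight is strictly positive."""
        lowest = min(e.ret for e in self.episodes)
        return [e.ret - lowest + 1e-3 for e in self.episodes]

    def sample(self, k):
        """Draw k episodes with replacement; needs a non-empty buffer."""
        return self.rng.choices(self.episodes,
                                weights=self.priorities(), k=k)


# Usage: three episodes go into a buffer that holds only two.
buffer = HighReturnReplay(capacity=2, seed=0)
buffer.add(["turn_a"], ret=1.0)
buffer.add(["turn_b"], ret=5.0)
buffer.add(["turn_c"], ret=-3.0)   # lowest return, evicted immediately
batch = buffer.sample(1000)
```

With these assumed rules, successful dialogs dominate the sampled batches, which is the intuition behind replaying high-return experience to speed up convergence.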

Keywords