IEEE Access (Jan 2025)

A Multi-Agent Approach to Modeling Task-Oriented Dialog Policy Learning

  • Songfeng Liang,
  • Kai Xu,
  • Zhurong Dong

DOI
https://doi.org/10.1109/ACCESS.2025.3529469
Journal volume & issue
Vol. 13
pp. 11754–11764

Abstract

Dialogue policy is a critical research area in human-computer interaction, vital for guiding dialogue generation and improving controllability and interpretability. Multi-agent dialogue policy learning demonstrates superior learning speed and exploration capabilities, positioning it as a promising approach for developing more effective and adaptive dialogue agents. However, many studies neglect holistic modeling of collaboration among agents, which limits the effectiveness of policy learning. This paper therefore proposes a new multi-agent group collaboration mechanism for dialogue policy learning, named GMPL. Concretely, we employ an Actor-Critic network to implement the proposed model, alternately updating the individual dialogue agents to optimize policy selection. In each update, the maximum action value function determines the dialogue action to take, while the maximum state value function guides the policy learning process. This integrated approach keeps the decision-making and learning phases aligned, thereby enhancing the overall performance of the dialogue agents. Furthermore, we provide a theoretical analysis of the convergence properties of the proposed model. Experiments on two distinct task-oriented dialogue datasets show that the proposed multi-agent model achieves a significantly faster learning speed and a higher dialogue success rate than baseline approaches.
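The alternating-update scheme the abstract describes can be illustrated with a minimal tabular sketch (not the paper's implementation): two agents take turns, the acting agent selects the action with the maximum action value, and its update target uses the maximum state value of the successor state. The environment dynamics, state/action sizes, and hyperparameters below are invented for illustration only.

```python
import random

N_STATES, N_ACTIONS = 5, 3  # toy sizes, not from the paper
GAMMA, ALPHA = 0.9, 0.1     # illustrative hyperparameters

def make_agent():
    # Each agent keeps its own tabular action-value function Q(s, a).
    return [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def select_action(q, s, eps=0.1):
    # Maximum action value function determines the dialogue action,
    # with a small epsilon-greedy exploration term.
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: q[s][a])

def step(s, a):
    # Invented toy dynamics: action 0 advances toward the goal state.
    s_next = min(s + 1, N_STATES - 1) if a == 0 else s
    reward = 1.0 if s_next == N_STATES - 1 else -0.1
    return s_next, reward, s_next == N_STATES - 1

def train(episodes=500, seed=0):
    random.seed(seed)
    agents = [make_agent(), make_agent()]
    for _ in range(episodes):
        s, turn, done = 0, 0, False
        while not done:
            q = agents[turn]  # agents are updated alternately, turn by turn
            a = select_action(q, s)
            s_next, r, done = step(s, a)
            # Maximum state value of the successor guides the update.
            v_next = 0.0 if done else max(q[s_next])
            q[s][a] += ALPHA * (r + GAMMA * v_next - q[s][a])
            s, turn = s_next, 1 - turn
    return agents

agents = train()
# After training, both agents should prefer the advancing action at state 0.
print(all(max(range(N_ACTIONS), key=lambda a: ag[0][a]) == 0 for ag in agents))
```

The sketch substitutes a generic value-based update for the paper's full Actor-Critic networks; it only shows the coordination pattern of alternating per-agent updates driven by max-value targets.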
