IEEE Access (Jan 2023)

Reinforcement Learning Agents Playing Ticket to Ride–A Complex Imperfect Information Board Game With Delayed Rewards

  • Shuo Yang,
  • Michael Barlow,
  • Thomas Townsend,
  • Xuejie Liu,
  • Dilini Samarasinghe,
  • Erandi Lakshika,
  • Glennn Moy,
  • Timothy Lynar,
  • Benjamin Turnbull

DOI
https://doi.org/10.1109/ACCESS.2023.3287100
Journal volume & issue
Vol. 11
pp. 60737–60757

Abstract

Board games are extensively studied in the AI community because they can represent real-world problems at a high level of abstraction, and because of their irreplaceable role as testbeds for state-of-the-art AI algorithms. Modern board games commonly feature partially observable state spaces and imperfect information. Despite recent AI successes in perfect information board games like chess and Go, most imperfect information games remain challenging and have yet to be solved. This paper empirically explores the capabilities of a state-of-the-art Reinforcement Learning (RL) algorithm, Proximal Policy Optimization (PPO), in playing Ticket to Ride, a popular board game with features of imperfect information, a large state-action space, and delayed rewards. It explores the feasibility of the proposed generalizable modelling and training schemes using a general-purpose RL algorithm with no domain knowledge-based heuristics beyond the game rules, game states, and scores. The performance of the proposed methodology is demonstrated in a scaled-down version of Ticket to Ride with a range of RL agents obtained with different training schemes. All RL agents achieve clear advantages over a set of well-designed heuristic agents, and the agent constructed through a self-play training scheme outperforms the other RL agents in a Round Robin tournament. The high performance and versatility of this self-play agent provide a solid demonstration of the capabilities of this framework.
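The abstract names PPO as the algorithm used. As a minimal sketch of the clipped surrogate objective at PPO's core (an illustration of the standard formulation, not the authors' implementation; function and parameter names are placeholders):

```python
import numpy as np

def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean clipped surrogate objective of PPO (to be maximized).

    ratios: probability ratios pi_new(a|s) / pi_old(a|s) for a batch
    advantages: advantage estimates for the same state-action pairs
    eps: clipping range (0.2 is a common default, not paper-specific)
    """
    ratios = np.asarray(ratios, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    unclipped = ratios * advantages
    # Clipping the ratio removes the incentive to move the new policy
    # far from the old one in a single update.
    clipped = np.clip(ratios, 1.0 - eps, 1.0 + eps) * advantages
    return float(np.mean(np.minimum(unclipped, clipped)))
```

For example, with a positive advantage the objective caps the gain once the ratio exceeds `1 + eps`, while with a negative advantage the pessimistic `min` keeps the penalty for overshooting.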

Keywords