Hybrid Training Strategies: Improving Performance of Temporal Difference Learning in Board Games

Jesús Fernández-Conde; Pedro Cuenca-Jiménez; José M. Cañas

doi:10.3390/app12062854

Applied Sciences (Mar 2022)

Affiliations

Jesús Fernández-Conde: Department of Telematic Systems and Computation, Rey Juan Carlos University, Fuenlabrada, 28942 Madrid, Spain
Pedro Cuenca-Jiménez: Department of Telematic Systems and Computation, Rey Juan Carlos University, Fuenlabrada, 28942 Madrid, Spain
José M. Cañas: Department of Telematic Systems and Computation, Rey Juan Carlos University, Fuenlabrada, 28942 Madrid, Spain

DOI: https://doi.org/10.3390/app12062854
Journal volume & issue: Vol. 12, no. 6, p. 2854

Abstract

Temporal difference (TD) learning is a well-known approach for training automated players in board games with a limited number of potential states through autonomous play. Because of its directness, TD learning has become widespread, but certain critical difficulties must be solved in order for it to be effective. It is impractical to train an artificial intelligence (AI) agent against a random player, since it takes millions of games for the agent to learn to play intelligently. Training the agent against a methodical player, on the other hand, is not an option owing to a lack of exploration. This article describes and examines a variety of hybrid training procedures for a TD-based automated player that combine randomness with specified plays in a predetermined ratio. We provide simulation results for the famous tic-tac-toe and Connect-4 board games, in which one of the studied training strategies significantly surpasses the other options. On average, it takes fewer than 100,000 games of training for an agent taught using this approach to act as a flawless player in tic-tac-toe.
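
As a rough illustration of the idea of mixing randomness with specified plays in a predetermined ratio, the following minimal sketch shows one possible hybrid training opponent for tic-tac-toe. It is not the authors' implementation: the abstract does not say whether the mix is applied per move or per game (per move is assumed here), and the names `hybrid_move`, `methodical_move`, `winning_move`, and `random_ratio`, as well as the win-else-block-else-first-free rule, are hypothetical choices made only for this example.

```python
import random

# The eight winning lines on a 3x3 board indexed 0..8, row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def legal_moves(board):
    """Indices of the empty squares (an empty square is a single space)."""
    return [i for i, cell in enumerate(board) if cell == " "]


def winning_move(board, mark):
    """Return a square that completes a line for `mark`, or None."""
    for line in WIN_LINES:
        cells = [board[i] for i in line]
        if cells.count(mark) == 2 and cells.count(" ") == 1:
            return line[cells.index(" ")]
    return None


def methodical_move(board, mark):
    """Assumed fixed policy: win if possible, else block, else first free square."""
    other = "O" if mark == "X" else "X"
    move = winning_move(board, mark)
    if move is None:
        move = winning_move(board, other)
    if move is None:
        move = legal_moves(board)[0]
    return move


def hybrid_move(board, mark, random_ratio, rng):
    """Play a random legal move with probability `random_ratio`,
    otherwise fall back to the methodical policy (mixing per move is an assumption)."""
    if rng.random() < random_ratio:
        return rng.choice(legal_moves(board))
    return methodical_move(board, mark)


if __name__ == "__main__":
    rng = random.Random(0)
    board = list("XX OO    ")  # X to move, with a win available at square 2
    print(hybrid_move(board, "X", random_ratio=0.0, rng=rng))
```

Setting `random_ratio` to 0.0 gives the purely methodical opponent and 1.0 the purely random one; intermediate values trade the exploration supplied by random moves against the stronger opposition supplied by the fixed policy, which is the trade-off the abstract describes.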

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
