Complexity (Jan 2022)

Limiting Dynamics for Q-Learning with Memory One in Symmetric Two-Player, Two-Action Games

  • J. M. Meylahn,
  • L. Janssen

DOI
https://doi.org/10.1155/2022/4830491
Journal volume & issue
Vol. 2022

Abstract


We develop a method based on computer algebra systems to represent the mutual pure strategy best-response dynamics of symmetric two-player, two-action repeated games played by players with a one-period memory. We apply this method to the iterated prisoner's dilemma, stag hunt, and hawk-dove games and identify all possible equilibrium strategy pairs and the conditions for their existence. The only equilibrium strategy pair possible in all three games is the pair in which both players use the win-stay, lose-shift strategy. Lastly, we show that the mutual best-response dynamics are realized by a sample batch Q-learning algorithm in the infinite batch size limit.
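As a rough illustration of the setting described in the abstract (not the authors' computer algebra method), the sketch below enumerates the 16 pure memory-one strategies for the iterated prisoner's dilemma, computes the long-run average payoff of deterministic play once it settles into a cycle, and checks that win-stay, lose-shift earns at least as much against itself as any alternative pure memory-one strategy does, i.e. that it is a symmetric pure-strategy best response to itself. The payoff values (T, R, P, S) = (5, 3, 1, 0), the choice of mutual cooperation as the starting state, and all function names are illustrative assumptions.

```python
import itertools

# Prisoner's dilemma payoffs for player 1, indexed by the previous
# outcome (own_action, opponent_action); C = cooperate, D = defect.
# Assumed values (T, R, P, S) = (5, 3, 1, 0).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]


def long_run_payoff(s1, s2, start=("C", "C")):
    """Average payoff to player 1 once deterministic play enters its
    limit cycle (guaranteed, since there are only four outcomes)."""
    state, seen, trajectory = start, {}, []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        a1 = s1[(state[0], state[1])]   # player 1 reacts to the last outcome
        a2 = s2[(state[1], state[0])]   # player 2 sees the outcome from its side
        state = (a1, a2)
    cycle = trajectory[seen[state]:]
    return sum(PAYOFF[s] for s in cycle) / len(cycle)


# All 16 pure memory-one strategies: a map from last outcome to an action.
ALL_STRATEGIES = [
    dict(zip(STATES, actions)) for actions in itertools.product("CD", repeat=4)
]

# Win-stay, lose-shift: keep your action after R or T, switch after S or P.
WSLS = {("C", "C"): "C", ("C", "D"): "D", ("D", "C"): "D", ("D", "D"): "C"}

# WSLS against itself versus the best pure memory-one reply to WSLS.
best = max(ALL_STRATEGIES, key=lambda s: long_run_payoff(s, WSLS))
print(long_run_payoff(WSLS, WSLS), long_run_payoff(best, WSLS))  # 3.0 3.0
```

In this sketch no deviation improves on the mutual-cooperation payoff of 3, which is consistent with the abstract's claim that the win-stay, lose-shift pair is an equilibrium of the mutual best-response dynamics; the paper's full analysis, including the stag hunt, hawk-dove, and the Q-learning limit, is symbolic rather than numerical.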