Hybrid Online and Offline Reinforcement Learning for Tibetan Jiu Chess

Xiali Li; Zhengyu Lv; Licheng Wu; Yue Zhao; Xiaona Xu

doi:10.1155/2020/4708075

Complexity (Jan 2020)

Affiliations

Xiali Li: School of Information and Engineering, Minzu University of China, Beijing 100081, China
Zhengyu Lv: School of Information and Engineering, Minzu University of China, Beijing 100081, China
Licheng Wu: School of Information and Engineering, Minzu University of China, Beijing 100081, China
Yue Zhao: School of Information and Engineering, Minzu University of China, Beijing 100081, China
Xiaona Xu: School of Information and Engineering, Minzu University of China, Beijing 100081, China

Journal volume & issue: Vol. 2020

Abstract

In this study, a hybrid of the state-action-reward-state-action algorithm, SARSA(λ), and Q-learning is applied to different stages of an upper confidence bounds applied to trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all the nodes on the search path when each game ends. A learning strategy is proposed that combines SARSA(λ) and Q-learning with domain knowledge to define the feedback function for the layout and battle stages. An improved deep neural network based on ResNet18 is used for self-play training. Experimental results show that hybrid online and offline reinforcement learning with a deep neural network can improve the game program’s learning efficiency and its understanding of Tibetan Jiu chess.
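
For readers unfamiliar with the two update rules named in the abstract, the following is a minimal sketch of their textbook tabular forms. It is not the authors' implementation, which applies these rules inside a tree search with a neural network. The function names, the dictionary-based tables `Q` and `E`, and the default values of `alpha`, `gamma`, and `lam` are all assumptions chosen for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, next_actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action value in s_next."""
    # A terminal state has no next actions, so its value defaults to 0.
    best_next = max((Q[(s_next, b)] for b in next_actions), default=0.0)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_lambda_update(Q, E, s, a, r, s_next, a_next,
                        alpha=0.1, gamma=0.9, lam=0.8):
    """On-policy update with accumulating eligibility traces."""
    # The TD error uses the action actually chosen in s_next.
    delta = r + gamma * Q[(s_next, a_next)] - Q[(s, a)]
    E[(s, a)] += 1.0
    # Every recently visited pair shares the error in proportion to its trace.
    for key in list(E):
        Q[key] += alpha * delta * E[key]
        E[key] *= gamma * lam


# Hypothetical two-step episode: the reward arrives only on the second step.
Q, E = defaultdict(float), defaultdict(float)
sarsa_lambda_update(Q, E, "s0", "a0", 0.0, "s1", "a1")
sarsa_lambda_update(Q, E, "s1", "a1", 1.0, "end", "none")
```

In this sketch the eligibility trace lets the delayed reward reach the earlier pair ("s0", "a0") within the same episode, whereas a one-step rule would update only the most recent pair.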

Published in Complexity

ISSN: 1076-2787 (Print); 1099-0526 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://onlinelibrary.wiley.com/journal/8503