PAC Optimal Exploration Algorithm Named RMAX-KNN

LI Chao, MEN Changqian, WANG Wenjian

doi:10.3778/j.issn.1673-9418.1905023

Jisuanji kexue yu tansuo (Mar 2020)

PAC Optimal Exploration Algorithm Named RMAX-KNN

LI Chao, MEN Changqian, WANG Wenjian

Affiliations

LI Chao, MEN Changqian, WANG Wenjian: 1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China 2. Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006, China

DOI: https://doi.org/10.3778/j.issn.1673-9418.1905023
Journal volume & issue: Vol. 14, no. 3
pp. 513 – 526

Abstract

Read online

The balance of exploration and exploitation is one of the focuses of reinforcement learning research. The exploration helps the agent understand the environment more comprehensively and make better decisions while the exploitation helps the agent make current optimal decisions based on its current cognition of the environment. At present, most of the exploration algorithms are only associated with the value function, regardless of the agent’s current cognitive level of the environment, so the efficiency of the exploration is extremely low. Aiming at solving this problem, this paper proposes a reinforcement learning algorithm named RMAX-KNN (reward maximum K-nearest neighbor) based on the adaptive discretization of the state space. The algorithm rewrites the value function according to the level of discretization of the state space and makes the agent explore the environment reasonably, gradually achieving the adaptive discretization of the environmental state space. By combining the exploration with the discretization of the environmental state space, the RMAX-KNN algorithm gradually raises the cognitive level of the agent in terms of the environment and increases the efficiency of exploration. At the same time, this algorithm is proven to be a probably approximately correct (PAC) optimal exploration algorithm theoretically. The simulation experiments in the Benchmark domains show that the RMAX-KNN algorithm can achieve the adaptive discretization of the environmental state space while exploring the environment and developing the optimal strategy.

Published in Jisuanji kexue yu tansuo

ISSN: 1673-9418 (Print)
Publisher: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://fcst.ceaj.org

About the journal

Abstract

Keywords