Effects of Sampling and Prediction Horizon in Reinforcement Learning

Pavel Osinenko; Dmitrii Dobriborsci

doi:10.1109/ACCESS.2021.3112498

IEEE Access (Jan 2021)

Affiliations

Pavel Osinenko: Skolkovo Institute of Science and Technology, Moscow, Russia
Dmitrii Dobriborsci: Skolkovo Institute of Science and Technology, Moscow, Russia

DOI: https://doi.org/10.1109/ACCESS.2021.3112498
Journal volume & issue: Vol. 9, pp. 127611–127618

Abstract

Plain reinforcement learning (RL) may be prone to loss of convergence, constraint violation, unexpected performance, etc. Commonly, RL agents undergo extensive learning stages to achieve proper functionality. This is in contrast to classical control algorithms, which are typically model-based. One direction of research is the fusion of RL with such algorithms, especially model-predictive control (MPC). This, however, introduces new hyper-parameters related to the prediction horizon. Furthermore, RL is usually concerned with Markov decision processes, which evolve in discrete time steps. Most real environments, however, are not time-discrete. The actual physical setting of RL consists of a digital agent and a time-continuous dynamical system. There is thus yet another hyper-parameter: the agent sampling time. In this paper, we investigate the effects of the prediction horizon and the sampling time on two hybrid RL-MPC agents in a case study on mobile robot parking, which is itself a canonical control problem. We benchmark the agents against a simple variant of MPC. The sampling time showed a “sweet spot” behavior, whereas the RL agents demonstrated merits at shorter horizons.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
