Modeling, Identification and Control (Jan 2022)

Model-free Control of Partially Observable Underactuated Systems by pairing Reinforcement Learning with Delay Embeddings

  • Martinius Knudsen,
  • Sverre Hendseth,
  • Gunnar Tufte,
  • Axel Sandvig

DOI
https://doi.org/10.4173/mic.2022.1.1
Journal volume & issue
Vol. 43, no. 1
pp. 1 – 8

Abstract

Partial observability is a problem in control design where the measured states are insufficient to describe the system's trajectory. Interesting real-world systems often exhibit nonlinear behavior and noisy, continuous-valued states that are poorly described by first principles, and which are only partially observable. If partial observability can be overcome, these conditions suggest the use of reinforcement learning (RL). In this paper we tackle the problem of controlling highly nonlinear underactuated dynamical systems, without a model, and with insufficient observations to infer the system's internal states. We approach the problem by creating a time-delay embedding from a subset of the observed state and applying RL to this embedding rather than the original state manifold. We find that delay embeddings work well with learning-based methods, as such methods do not require a precise description of the system's state. Instead, RL learns to map any observation to an appropriate action (determined by a reward function), even if these observations do not lie on the original geometric state manifold.
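The abstract's core idea (using a stack of time-delayed partial observations as the RL agent's state) can be sketched in a few lines of Python. The class name DelayEmbedding, the embedding dimension k, the delay tau, and the padding behavior below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): turn a partial observation y_t into
# the delay embedding [y_t, y_{t-tau}, ..., y_{t-(k-1)tau}] that an RL agent
# can use in place of the unobservable full state.
from collections import deque
import numpy as np

class DelayEmbedding:
    def __init__(self, k: int, tau: int):
        self.k = k          # number of delayed copies in the embedding
        self.tau = tau      # delay (in time steps) between copies
        # keep just enough history to reach back (k-1)*tau steps
        self.history = deque(maxlen=(k - 1) * tau + 1)

    def reset(self, y0: np.ndarray) -> np.ndarray:
        self.history.clear()
        self.history.append(np.asarray(y0, dtype=float))
        return self.embed()

    def step(self, y: np.ndarray) -> np.ndarray:
        self.history.append(np.asarray(y, dtype=float))
        return self.embed()

    def embed(self) -> np.ndarray:
        hist = list(self.history)
        parts = []
        for i in range(self.k):
            idx = len(hist) - 1 - i * self.tau
            # pad with the oldest available observation until history fills
            parts.append(hist[max(idx, 0)])
        return np.concatenate(parts)
```

In a control loop, the agent would then condition its policy on embed() outputs (dimension k times the size of the observed subset) rather than on the raw partial observation.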

Keywords