IEEE Access (Jan 2022)
Learning an Accurate State Transition Dynamics Model by Fitting Both a Function and its Derivative
Abstract
Learning an accurate state transition dynamics model in a sample-efficient way is important in model-based reinforcement learning for many robotic applications, where the model must predict future states from the current states and actions of a system both accurately and efficiently. This study proposes a sample-efficient learning approach that accurately learns a state transition dynamics model by fitting both the predicted next states and their derivatives. The derivatives of the feedforward neural network outputs (next states) with respect to the inputs (current states and actions) are computed using the chain rule. In addition, the effect of the activation function on learning the derivatives is illustrated with an example based on a sum of elementary sine functions, and the results are compared across various activation functions with respect to accuracy. The proposed learning approach exhibits a significant improvement in accuracy for both one-step and multi-step prediction with a six-degree-of-freedom manipulation robot (UR-10) in both simulated and real environments.
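The idea of fitting both a function and its derivative can be illustrated with a minimal sketch. The code below is not the authors' implementation: the tanh activation, the layer sizes, the weighting factor `lam`, and the helper names `init_mlp`, `forward_with_jacobian`, and `combined_loss` are all assumptions made for illustration, and the derivative targets are assumed to be available. It propagates the Jacobian of a small feedforward network through each layer with the chain rule and combines a value error with a derivative error in one loss.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Create (W, b) pairs for a fully connected network (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (n, m)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward_with_jacobian(params, x):
    """Return the output y and the Jacobian dy/dx for one input vector x.

    The Jacobian is accumulated layer by layer with the chain rule:
    d(W h + b)/dx = W @ dh/dx, and d tanh(z)/dx = diag(1 - tanh(z)^2) @ dz/dx.
    """
    h = x
    J = np.eye(x.size)                      # dh/dx at the input layer
    for i, (W, b) in enumerate(params):
        z = W @ h + b
        J = W @ J                           # affine layer
        if i < len(params) - 1:             # hidden layers use tanh (assumption)
            h = np.tanh(z)
            J = (1.0 - h ** 2)[:, None] * J
        else:                               # linear output layer
            h = z
    return h, J

def combined_loss(params, x, y_target, J_target, lam=1.0):
    """Value error plus lam times derivative error (lam is an assumed weight)."""
    y, J = forward_with_jacobian(params, x)
    return np.mean((y - y_target) ** 2) + lam * np.mean((J - J_target) ** 2)

# Example: 3 inputs (for instance, state and action entries), 2 predicted outputs.
params = init_mlp([3, 8, 2])
x = np.array([0.1, -0.2, 0.3])
y, J = forward_with_jacobian(params, x)
```

Here `J` has one row per predicted output and one column per input, so `J[i, k]` is the sensitivity of output `i` to input `k`. In a training loop, `combined_loss` would be minimized over the network parameters with any gradient-based optimizer.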
Keywords