IEEE Open Journal of Control Systems (Jan 2023)
Model-Free Distributed Reinforcement Learning State Estimation of a Dynamical System Using Integral Value Functions
Abstract
One of the challenging problems in sensor network systems is to estimate and track the state of a target point mass with unknown dynamics. Recent advances in deep learning (DL) have renewed interest in applying DL techniques to state estimation problems. However, these approaches typically omit process noise, which implicitly restricts them to non-maneuvering targets, since process noise is as significant as measurement noise when tracking maneuvering targets. In this paper, we propose a continuous-time (CT) model-free or model-building distributed reinforcement learning estimator (DRLE) for sensor networks that uses an integral value function. The DRLE algorithm learns an optimal policy from a neural value function in order to estimate the state of a target point mass. The proposed estimator equips each node in the network with two high-pass consensus filters, one over weighted measurements and one over inverse-covariance matrices, together with a critic reinforcement learning mechanism. The efficiency of the proposed DRLE is demonstrated in a simulation experiment on a network of underactuated vertical takeoff and landing (VTOL) aircraft with strong input coupling. The experiment highlights two advantages of DRLE: i) it does not require the dynamic model to be known, and ii) it is an order of magnitude faster than the state-dependent Riccati equation (SDRE) baseline.
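To make the consensus-filtering layer of the architecture concrete, the following is a minimal, hypothetical sketch (not the authors' exact algorithm) of one Euler step of a distributed consensus filter. Each sensor node nudges its local quantity (e.g., a weighted measurement or an inverse-covariance term) toward its graph neighbors' values while tracking its own locally injected input; the function name, gains, and graph encoding are all illustrative assumptions.

```python
import numpy as np

def consensus_step(q, adjacency, u, gamma=0.1, dt=0.01):
    """One Euler step of a simple consensus filter (illustrative sketch).

    q         : (n, d) current node states (e.g., weighted measurements)
    adjacency : (n, n) symmetric 0/1 adjacency matrix of the sensor graph
    u         : (n, d) local inputs injected by each node
    gamma     : consensus gain (assumed value, not from the paper)
    dt        : integration step
    """
    n = q.shape[0]
    q_new = q.copy()
    for i in range(n):
        neighbors = np.nonzero(adjacency[i])[0]
        # Consensus term: pull node i toward its neighbors' values.
        consensus = sum(q[j] - q[i] for j in neighbors)
        # Input-tracking term: drive node i toward its local signal u[i].
        q_new[i] = q[i] + dt * (gamma * consensus + (u[i] - q[i]))
    return q_new
```

Run over a connected graph, repeated steps drive all nodes toward agreement on the network-wide fused quantity; in the paper's estimator, one such filter runs over weighted measurements and another over inverse-covariance matrices at every node.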
Keywords