Journal of Information and Telecommunication (Apr 2021)

Heuristic and deep reinforcement learning-based PID control of trajectory tracking in a ball-and-plate system

  • Emmanuel Okafor,
  • Daniel Udekwe,
  • Yusuf Ibrahim,
  • Muhammed Bashir Mu'azu,
  • Ekene Gabriel Okafor

DOI
https://doi.org/10.1080/24751839.2020.1833137
Journal volume & issue
Vol. 5, no. 2
pp. 179 – 196

Abstract

Manual tuning of controller parameters, for example proportional-integral-derivative (PID) gains, often relies on tedious human engineering. To address this problem, we propose an artificial intelligence-based deep reinforcement learning (RL) PID controller (three variants) and compare it with a genetic algorithm-based PID (GA-PID) and a classical PID; a total of five controllers were simulated for control and trajectory tracking of the ball dynamics in a linearized ball-and-plate (B&P) system. For the experiments, we trained novel variants of deep RL-PID built from a customized deep deterministic policy gradient (DDPG) agent (by modifying the neural network architecture), resulting in two new RL agents (DDPG-FC-350-R-PID & DDPG-FC-350-E-PID). Each agent interacts with the environment through a policy and a learning algorithm to produce a set of actions (optimal PID gains). Additionally, we evaluated the five controllers to assess which method provides the best performance in terms of minimal predictive-error indices, steady-state error, peak overshoot, and time responses. The results show that our proposed architecture (DDPG-FC-350-E-PID) yielded the best performance and surpassed all other approaches on most of the evaluation metrics. Furthermore, appropriate training of an artificial intelligence-based controller can help achieve the best path tracking.
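The core idea described above — an RL agent's action vector supplying the PID gains that drive the plant — can be sketched in a few lines. This is not the authors' code: the discrete PID form, the toy double-integrator plant standing in for one linearized B&P axis, and the fixed gain values are all illustrative assumptions.

```python
class PID:
    """Discrete PID controller; gains (kp, ki, kd) would be an RL agent's action."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        # Rectangular integration and backward-difference derivative.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(gains, steps=2000, dt=0.01, setpoint=0.1):
    """Track a position setpoint on a toy double-integrator plant x'' = u.

    (The true linearized B&P axis scales the plate angle by roughly 5g/7;
    that constant is omitted here for simplicity.)
    """
    pid = PID(*gains, dt)
    x, v = 0.0, 0.0  # ball position and velocity
    for _ in range(steps):
        u = pid.step(setpoint - x)
        v += u * dt          # semi-implicit Euler integration
        x += v * dt
    return x


# In the paper's setup the agent outputs the gains; fixed values illustrate:
final_position = simulate((20.0, 0.0, 8.0))
```

With these hypothetical PD gains the closed loop behaves like a damped oscillator, so after 20 simulated seconds the ball position settles at the 0.1 setpoint; an RL agent would instead learn the gain vector that minimizes a tracking-error reward.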

Keywords