IEEE Access (Jan 2024)

A Novel Policy Distillation With WPA-Based Knowledge Filtering Algorithm for Efficient Industrial Robot Control

  • Gilljong Shin,
  • Seongjin Yun,
  • Won-Tae Kim

DOI
https://doi.org/10.1109/ACCESS.2024.3483970
Journal volume & issue
Vol. 12
pp. 154514 – 154525

Abstract

Advanced factories strongly need autonomous control methods for manufacturing robots that respond flexibly to the varied requirements of operational engineers. Deep reinforcement learning is a more promising technology for supporting dynamic factory situations than legacy static robot control technologies. However, machine learning models face the computing resource limitations of industrial robots performing complex intelligent operations, including machine vision, action planning, and human collaboration. Policy distillation is a model compression scheme for deep reinforcement learning based on a teacher-student model, in which a pre-trained teacher model transfers its knowledge to structurally simplified student models to enhance computing efficiency. However, it may suffer from anomalous knowledge and abnormal robot movements when the teacher model converges to a locally optimal policy. In this paper, we propose a novel policy distillation with a win probability added (WPA) based knowledge filtering algorithm for efficient industrial robot control. The proposed mechanism adopts the WPA method from sports analytics to evaluate the knowledge extracted from the teacher model and to filter out bad knowledge. The filtered knowledge is then reconstructed with interpolation so that the student model is trained on high-quality data. Well-designed experiments on a robot arm and an unmanned ground vehicle (UGV) show an 11% improvement in model compression and a 5% reduction in the execution time and number of steps required for the tasks.
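The filter-then-interpolate step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the trajectory representation, and the use of a generic per-sample quality score in place of the actual WPA computation are all assumptions.

```python
def filter_and_interpolate(trajectory, scores, threshold):
    """Hypothetical sketch: drop teacher samples whose quality score
    (standing in for the WPA-based evaluation) falls below `threshold`,
    then fill each gap by linearly interpolating between the nearest
    surviving neighbours."""
    kept = [i for i, s in enumerate(scores) if s >= threshold]
    if not kept:  # nothing survives the filter; return data unchanged
        return list(trajectory)
    filtered = list(trajectory)
    for i in range(len(trajectory)):
        if i in kept:
            continue
        # nearest kept neighbours on each side of the filtered sample
        left = max((k for k in kept if k < i), default=None)
        right = min((k for k in kept if k > i), default=None)
        if left is None or right is None:
            # at a boundary: copy the single nearest surviving sample
            filtered[i] = trajectory[right if left is None else left]
        else:
            w = (i - left) / (right - left)
            filtered[i] = tuple((1 - w) * a + w * b
                                for a, b in zip(trajectory[left],
                                                trajectory[right]))
    return filtered
```

For example, with samples `[(0.0,), (9.0,), (2.0,)]` and scores `[1.0, 0.0, 1.0]` at threshold 0.5, the anomalous middle sample is replaced by the midpoint `(1.0,)` of its neighbours.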

Keywords