IEEE Access (Jan 2023)

Policy Optimization for Waste Crane Automation From Human Preferences

  • Yuhwan Kwon,
  • Hikaru Sasaki,
  • Terushi Hirabayashi,
  • Kaoru Kawabata,
  • Takamitsu Matsubara

DOI
https://doi.org/10.1109/ACCESS.2023.3331373
Journal volume & issue
Vol. 11
pp. 126524 – 126541

Abstract

This research introduces a novel approach to optimizing control policies for waste cranes operating at waste-to-energy plants. Previous methods required people to define evaluation functions for automation, but such design work is often challenging in actual environments because of limited sensors and the inherent difficulty of the design. This paper aims to establish a methodology that achieves automation by having people respond to interactive pairwise comparison queries, a task that is relatively simple compared with design work. This form of automation, however, must address two issues: the increased sample cost caused by slow crane operation, and the difficulty of decision-making caused by waste inhomogeneity. Our proposed Preferential Bayesian Policy Optimization (PBPO) optimizes control policies with a small number of queries using Preference-based Bayesian optimization (PbBO) and mitigates the difficulty of decision-making by giving human evaluators the option to skip queries. To enhance query efficiency, we also incorporate a query synthesis mechanism that generates a new preference relation from the skipped queries. PBPO’s effectiveness was validated on a scattering task employed in previous studies. Experimental results with simulated evaluators show the effectiveness of both PBPO and query synthesis. Furthermore, results with actual human evaluators indicate that our proposed method performs as well as the Bayesian optimization (BO) method, which requires an evaluation function.

Keywords