Time-Varying Preference Bandits for Robot Behavior Personalization

Chanwoo Kim; Joonhyeok Lee; Eunwoo Kim; Kyungjae Lee

doi:10.3390/app142311002

Applied Sciences (Nov 2024)

Affiliations

Chanwoo Kim: Department of Artificial Intelligence, Chung-Ang University, Seoul 06974, Republic of Korea
Joonhyeok Lee: Department of Artificial Intelligence, Chung-Ang University, Seoul 06974, Republic of Korea
Eunwoo Kim: School of Computer Science and Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
Kyungjae Lee: Department of Statistics, Korea University, Seoul 02841, Republic of Korea

DOI: https://doi.org/10.3390/app142311002
Journal volume & issue: Vol. 14, no. 23, p. 11002

Abstract

Robots are increasingly employed in diverse services, from room cleaning to coffee preparation, necessitating an accurate understanding of user preferences. Traditional preference-based learning allows robots to learn these preferences through iterative queries about desired behaviors. However, these methods typically assume static human preferences. In this paper, we challenge this static assumption by considering the dynamic nature of human preferences and introduce the discounted preference bandit method to manage these changes. This algorithm adapts to evolving human preferences and supports seamless human–robot interaction through effective query selection. Our approach outperforms existing methods in time-varying scenarios across three key performance metrics.

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
