IEEE Access (Jan 2022)

Multimodal Engagement Prediction in Multiperson Human–Robot Interaction

  • Ahmed A. Abdelrahman,
  • Dominykas Strazdas,
  • Aly Khalifa,
  • Jan Hintz,
  • Thorsten Hempel,
  • Ayoub Al-Hamadi

DOI
https://doi.org/10.1109/ACCESS.2022.3182469
Journal volume & issue
Vol. 10
pp. 61980 – 61991

Abstract

The ability to measure the engagement level of humans interacting with robots paves the way towards intuitive and safe human–robot interaction. Recent approaches have made reasonable progress in predicting human engagement in physically situated environments. However, engagement estimation remains a challenging problem, especially in open-world environments, due to the difficulty of capturing and monitoring a variety of human social cues in real time. Furthermore, interactions may involve a group of subjects engaging with the robot simultaneously, which increases the prediction complexity. In this paper, we design a real-time engagement estimation system for humans interacting with robots, with generalization capability. We propose to estimate engagement in three stages, combining learning-based and rule-based approaches. First, state-of-the-art deep learning methods extract engagement features from input frames. Then, a simple neural network estimates a focus-of-attention score by incorporating gaze and head-pose features, and this score is assigned to each subject in the scene using a face recognition algorithm. Finally, a rule-based classifier predicts the engagement state of each subject, which is used to initiate or terminate the interaction with the robot. To evaluate our system effectively, we assess our approach for each stage separately. Additionally, we conduct an online evaluation study in which subjects interact freely with an industrial robot. Our model achieves average precision, recall, and F-score of 96%, 90%, and 93%, respectively.
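The three-stage pipeline described above can be sketched in simplified form. The mapping below from gaze and head-pose deviation to a focus-of-attention score, as well as the thresholds and weights, are illustrative assumptions, not the paper's actual learned parameters: the paper learns the score with a small neural network, whereas a hand-tuned linear blend stands in for it here.

```python
# Hypothetical sketch of the paper's three-stage engagement pipeline.
# All numeric values (weights, angle limits, thresholds) are assumptions
# for illustration only, not taken from the paper.

def attention_score(gaze_angle_deg, head_yaw_deg):
    """Stage 2 (simplified): map a subject's gaze and head-pose deviation
    from the robot's camera axis to a focus-of-attention score in [0, 1].
    Larger deviations yield lower scores."""
    gaze_term = max(0.0, 1.0 - abs(gaze_angle_deg) / 45.0)
    head_term = max(0.0, 1.0 - abs(head_yaw_deg) / 60.0)
    return 0.6 * gaze_term + 0.4 * head_term

def classify_engagement(score, currently_engaged,
                        start_thresh=0.7, stop_thresh=0.4):
    """Stage 3: rule-based decision with hysteresis, so the robot does
    not flicker between initiating and terminating an interaction when
    the score hovers near a single threshold."""
    if not currently_engaged and score >= start_thresh:
        return True   # initiate interaction
    if currently_engaged and score < stop_thresh:
        return False  # terminate interaction
    return currently_engaged

# A subject looking almost straight at the robot becomes engaged...
engaged = classify_engagement(attention_score(5.0, 10.0), False)
# ...and stays engaged through a brief moderate glance away (hysteresis).
engaged = classify_engagement(attention_score(20.0, 20.0), engaged)
print(engaged)  # True
```

In a multiperson scene, stage 1 (face detection and recognition) would produce one gaze/head-pose feature set per subject, and the score and rule would be applied to each identity independently.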

Keywords