Egyptian Informatics Journal (Jun 2024)

Egocentric intention object prediction based on a human-like manner

  • Zongnan Ma,
  • Jingru Men,
  • Fuchun Zhang,
  • Zhixiong Nan

Journal volume & issue
Vol. 26
p. 100482

Abstract


This paper deals with the problem of egocentric intention object prediction, which requires a model to produce a probability map for the possible locations of human intention objects, given an egocentric image captured during daily activities. Existing methods typically rely on visible cues (e.g., visual attention features and human hand features) to predict intention objects, assuming that intention object selection follows a bottom-up process. However, when humans decide on intention objects, a top-down cognitive process also occurs invisibly, analyzing each candidate object’s relevance to the ongoing activity (e.g., how the object’s function aligns with the activity goal) and to the overall scene (e.g., semantic context and object distances). Based on this idea, this paper introduces a multi-modal fusion mechanism that considers both visible bottom-up cues and invisible top-down cues to predict intention objects in a human-like manner. Additionally, this study pioneers the use of a multi-depth supervision mechanism in human intention object prediction. In experiments on two public datasets, our method surpasses eight baseline approaches, and ablation studies validate the effectiveness of both mechanisms.
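The abstract does not give the network details, so the following is only a minimal, hypothetical PyTorch sketch of the general idea it describes: fusing per-pixel bottom-up visual cues with a global top-down context embedding, and supervising the predicted probability map at several decoder depths. All module names, channel sizes, the concatenation-based fusion, and the equal-weight loss are illustrative assumptions, not the authors' architecture.

```python
# Hedged sketch, not the paper's released code: one plausible way to fuse
# bottom-up and top-down cues and apply multi-depth supervision.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FusionPredictor(nn.Module):
    """Fuses bottom-up visual cues with a top-down context vector and emits
    an intention-object probability map at each decoder depth."""

    def __init__(self, visual_ch=256, context_dim=128):
        super().__init__()
        # Bottom-up branch: per-pixel visual features (e.g., attention / hand cues).
        self.bottom_up = nn.Conv2d(visual_ch, 128, kernel_size=3, padding=1)
        # Top-down branch: a global scene/activity embedding broadcast over space.
        self.top_down = nn.Linear(context_dim, 128)
        # Decoder blocks; each depth gets its own prediction head so every
        # depth can be supervised against the ground-truth map.
        self.blocks = nn.ModuleList(
            [nn.Conv2d(256 if i == 0 else 64, 64, 3, padding=1) for i in range(3)]
        )
        self.heads = nn.ModuleList([nn.Conv2d(64, 1, 1) for _ in range(3)])

    def forward(self, visual_feat, context_vec):
        b, _, h, w = visual_feat.shape
        bu = F.relu(self.bottom_up(visual_feat))         # (B, 128, H, W)
        td = F.relu(self.top_down(context_vec))          # (B, 128)
        td = td.view(b, -1, 1, 1).expand(-1, -1, h, w)   # broadcast over space
        x = torch.cat([bu, td], dim=1)                   # simple fusion by concatenation
        maps = []
        for block, head in zip(self.blocks, self.heads):
            x = F.relu(block(x))
            maps.append(torch.sigmoid(head(x)))          # one probability map per depth
        return maps


def multi_depth_loss(pred_maps, target):
    """Multi-depth supervision: sum of BCE losses over all decoder depths
    (equal weights assumed for illustration)."""
    return sum(F.binary_cross_entropy(p, target) for p in pred_maps)


if __name__ == "__main__":
    model = FusionPredictor()
    visual = torch.randn(2, 256, 32, 32)   # toy backbone features
    context = torch.randn(2, 128)          # toy top-down context embedding
    gt_map = torch.rand(2, 1, 32, 32)      # toy ground-truth probability map
    loss = multi_depth_loss(model(visual, context), gt_map)
    print(loss.item())
```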

Keywords