IEEE Access (Jan 2024)

Smart Task Assistance Through Deep Learning-Based Visual Guidance in Asymmetric XR Remote Collaboration

  • Hongju Moon,
  • Seunghyeon Yu,
  • Jae Yeol Lee

DOI
https://doi.org/10.1109/ACCESS.2024.3455014
Journal volume & issue
Vol. 12
pp. 126899 – 126914

Abstract

In the emerging contactless era, many studies have paid attention to remote collaboration. Although remote collaboration offers the distinct advantage of enabling cooperation independent of geographical limitations, the effectiveness of communication and the depth of shared understanding are limited compared to face-to-face collaboration. These limitations can be overcome through extended reality (XR), which encompasses both augmented reality (AR) and virtual reality (VR). Although previous studies have integrated AR and VR for asymmetric collaboration, supporting visual guidance for effective task assistance remains challenging. This study proposes an asymmetric XR-based remote collaboration approach that supports smart task assistance by reconstructing the 3D virtual space of the local AR environment as a digital twin of the real-world spatial reference and by providing deep learning-based visual guidance, in addition to multimodal cues such as hand gestures and eye gazing. Thus, a remote VR expert can comprehensively understand and explore the local working situation and interact with the AR worker through various interaction metaphors. The VR expert can then guide the AR worker to perform tasks more effectively through step-by-step instructions with deep learning-based visual cues and annotations. A user study was conducted to explore the advantages of deep learning-based visual guidance for task assistance in asymmetric XR remote collaboration. The results showed that collaborating with deep learning-based visual guidance improved task execution time and some criteria concerning usability and workload. In addition, social presence was higher when eye gazing was provided. The findings can help design better XR-enabled remote collaboration and provide new directions for future research.

Keywords