The Group Interaction Field for Learning and Explaining Pedestrian Anticipation
Xueyang Wang, Xuecheng Chen, Puhua Jiang, Haozhe Lin, Xiaoyun Yuan, Mengqi Ji, Yuchen Guo, Ruqi Huang, Lu Fang
Affiliations
Xueyang Wang
Sigma Laboratory, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China; Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China; Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China
Xuecheng Chen
Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China
Puhua Jiang
Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China
Haozhe Lin
Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China; Department of Automation, Tsinghua University, Beijing 100084, China
Xiaoyun Yuan
Sigma Laboratory, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China; Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
Mengqi Ji
Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
Yuchen Guo
Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
Ruqi Huang
Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China
Lu Fang
Sigma Laboratory, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China; Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China; Zhejiang Future Technology Institute, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, Jiaxing 314033, China; Corresponding author.
Anticipating others’ actions is innate to humans and essential for navigating and interacting with others in dense crowds. This ability is urgently needed by unmanned systems such as service robots and self-driving cars. However, existing solutions struggle to predict pedestrian anticipation accurately because they do not adequately account for the influence of group-related social behaviors. Although group relationships and group interactions are ubiquitous and significantly influence pedestrian anticipation, their influence is diverse and subtle, making it difficult to quantify explicitly. Here, we propose the group interaction field (GIF), a novel group-aware representation that quantifies pedestrian anticipation as a probability field of pedestrians’ future locations and attention orientations. An end-to-end neural network, GIFNet, is tailored to estimate the GIF from explicit multidimensional observations. GIFNet quantifies the influence of group behaviors by formulating a group interaction graph with propagation and graph attention that is adaptive to the group size and dynamic interaction states. The experimental results show that the GIF effectively represents the change in pedestrians’ anticipation under the prominent impact of group behaviors and accurately predicts pedestrians’ future states. Moreover, the GIF helps to explain various predictions of pedestrians’ behavior in different social states. The proposed GIF will eventually allow unmanned systems to work in a human-like manner and comply with social norms, thereby promoting harmonious human–machine relationships.