Applied Sciences (Nov 2023)

Trajectory Prediction with Attention-Based Spatial–Temporal Graph Convolutional Networks for Autonomous Driving

  • Hongbo Li,
  • Yilong Ren,
  • Kaixuan Li,
  • Wenjie Chao

DOI
https://doi.org/10.3390/app132312580
Journal volume & issue
Vol. 13, no. 23
p. 12580

Abstract


Accurate and reliable trajectory prediction is crucial for the safe and efficient operation of autonomous vehicles. A vehicle observes the historical trajectories of surrounding moving objects and predicts their behavioral intentions over a future time horizon. With the predicted trajectories of moving objects such as obstacle vehicles, pedestrians, and non-motorized vehicles as inputs, self-driving vehicles can make more rational driving decisions and plan safer, more reasonable motions. However, traffic environments such as intersections are highly interdependent and dynamic, which makes motion anticipation challenging. Existing works focus on the mutual relationships among vehicles while ignoring other potentially essential interactions, such as those between vehicles and traffic rules. These studies have not yet explored in depth how to learn the interactions among multiple agents, which may lead to evaluation deviations. To address these issues, we design a novel framework, trajectory prediction with attention-based spatial–temporal graph convolutional networks (TPASTGCN). In our proposal, the multi-agent interaction mechanisms, including vehicle–vehicle and vehicle–traffic rule interactions, are explicitly modeled and integrated into one homogeneous graph by transferring the time-series data of traffic lights into the spatial–temporal domain. By integrating the attention mechanism into the adjacency matrix, we learn the different strengths of interactive association and improve the model’s ability to capture critical features. At the same time, we construct a hierarchical structure that employs a spatial GCN and a temporal GCN to extract the spatial dependencies of traffic networks. With a gated recurrent unit (GRU), the scene context in the temporal dimension is further captured and enhanced in the encoder. In this way, the GCN and GRU networks are fused into a feature extractor module in the proposed framework. Finally, another GRU network generates the potential future trajectories. Experiments on real-world datasets demonstrate the superior performance of the scheme compared with several baselines.
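
The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea of folding attention into an adjacency matrix, not the authors' implementation. It assumes dot-product attention scores, a binary adjacency matrix with self-loops, and a toy scene in which a traffic light is treated as one more graph node. The function names `attention_adjacency` and `graph_conv`, the feature values, and the graph layout are all hypothetical.

```python
import math

def attention_adjacency(features, adjacency):
    """Replace each existing edge with a softmax-normalised attention weight.

    Assumption: scores are plain dot products between node features;
    the paper may use a learned scoring function instead.
    """
    n = len(features)
    weighted = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Score only the neighbours that the binary adjacency allows.
        neighbours = [j for j in range(n) if adjacency[i][j]]
        if not neighbours:
            continue
        scores = [
            sum(a * b for a, b in zip(features[i], features[j]))
            for j in neighbours
        ]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        for j, e in zip(neighbours, exps):
            weighted[i][j] = e / total
    return weighted

def graph_conv(features, weighted_adjacency):
    """One propagation step: each node aggregates its neighbours' features
    using the attention weights (no learned projection in this sketch)."""
    dim = len(features[0])
    return [
        [sum(w * features[j][d] for j, w in enumerate(row)) for d in range(dim)]
        for row in weighted_adjacency
    ]

# Hypothetical toy scene: nodes 0 and 1 are vehicles, node 2 is a traffic light.
features = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
adjacency = [[1, 1, 1], [1, 1, 0], [1, 0, 1]]

weights = attention_adjacency(features, adjacency)
updated = graph_conv(features, weights)
```

Each row of `weights` sums to one over the edges present in `adjacency`, so a node attends more strongly to neighbours whose features score higher while pairs that are not connected keep a weight of zero. In a full model, the per-frame outputs of such a layer would be passed to a recurrent encoder such as a GRU; that part is omitted here.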

Keywords