IEEE Access (Jan 2025)
A Cross-Dimensional Attention Mechanism for Pedestrian Trajectory Forecasting
Abstract
Forecasting pedestrian trajectories is crucial for autonomous driving systems but remains challenging due to complex spatial and temporal interactions. Most existing methods model these interactions sequentially; for example, they first capture temporal features and then pass this information to a spatial interaction model. This sequential design hinders communication between the two dimensions and reduces forecasting accuracy. To address this limitation, we propose a novel method called CAM (Cross-Dimensional Attention Mechanism for Pedestrian Trajectory Forecasting). CAM captures temporal and spatial features independently and then facilitates effective communication between them. Specifically, we use graph attention networks to capture spatial features and transformers to capture temporal features. The cross-dimensional attention mechanism enables features encoded in one dimension to query and retrieve relevant information from the other dimension, so that features in both dimensions interact effectively. We evaluated our method on two well-known public datasets, ETH and UCY. The results show that our method improves forecasting accuracy, achieving an average displacement error (ADE) of 0.21 and a final displacement error (FDE) of 0.41, which improves the ADE and FDE by 16.0% and 6.8%, respectively.
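As a rough illustration of the two ideas named in the abstract, the following minimal sketch shows (a) one set of features querying another through scaled dot-product attention and (b) ADE and FDE for a single predicted trajectory. It is not the CAM implementation: the function names `cross_attention` and `ade_fde` are hypothetical, the learned query/key/value projections, multi-head structure, graph attention encoder, and transformer encoder are all omitted, and the toy feature vectors are made up for the example. ADE and FDE follow their usual definitions (mean Euclidean error over all predicted steps, and Euclidean error at the last step).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention without learned projections (assumption).

    Each query vector retrieves a weighted average of `values`, where the
    weights come from its similarity to the corresponding `keys`.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def ade_fde(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory given as a list of (x, y) points."""
    errors = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(errors) / len(errors), errors[-1]


# Toy features (hypothetical): 2 temporal tokens and 3 spatial tokens.
temporal = [[1.0, 0.0], [0.0, 1.0]]
spatial = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Cross-dimensional exchange: each side queries the other.
temporal_enriched = cross_attention(temporal, spatial, spatial)
spatial_enriched = cross_attention(spatial, temporal, temporal)

# Metric example on a 3-step trajectory.
ade, fde = ade_fde([(0, 0), (1, 0), (2, 0)], [(0, 0), (1, 1), (2, 2)])
```

Running attention in both directions, as in the last lines, mirrors the abstract's description of each dimension querying the other; how CAM fuses the two enriched feature sets afterwards is not specified here and is left out of the sketch.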
Keywords