Complex & Intelligent Systems (Jun 2024)

MDSTF: a multi-dimensional spatio-temporal feature fusion trajectory prediction model for autonomous driving

  • Xing Wang,
  • Zixuan Wu,
  • Biao Jin,
  • Mingwei Lin,
  • Fumin Zou,
  • Lyuchao Liao

DOI
https://doi.org/10.1007/s40747-024-01490-4
Journal volume & issue
Vol. 10, no. 5
pp. 6647–6665

Abstract

In the field of autonomous driving, trajectory prediction of traffic agents is an important and challenging problem. Fully capturing the complex spatio-temporal features in trajectory data is crucial for accurate trajectory prediction. This paper proposes a trajectory prediction model called multi-dimensional spatio-temporal feature fusion (MDSTF), which integrates multi-dimensional spatio-temporal features to model the trajectory information of traffic agents. In the spatial dimension, we employ a graph convolutional network (GCN) to capture the local spatial features of traffic agents, a spatial attention mechanism to capture their global spatial features, and an LSTM combined with spatial attention to capture their full-process spatial features. These three spatial features are then fused using a gate fusion mechanism. Moreover, while modeling the full-process spatial features, the LSTM also captures short-term temporal dependencies in the trajectory information. In the temporal dimension, we utilize a Transformer-based encoder to extract long-term temporal dependencies, which are then fused with the short-term temporal dependencies captured by the LSTM. Finally, we employ two temporal convolutional networks (TCN) to predict trajectories from the fused spatio-temporal features. Experimental results on the ApolloScape trajectory dataset demonstrate that our proposed method outperforms state-of-the-art methods on the weighted sum of average displacement error (WSADE) and weighted sum of final displacement error (WSFDE) metrics, achieving reductions of 4.37% and 6.23%, respectively, compared to the best baseline model (S2TNet).
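The abstract describes fusing three spatial feature streams (local GCN features, global attention features, and full-process LSTM features) via a gate fusion mechanism. The PyTorch sketch below illustrates one plausible form of such gated fusion, where learned softmax gates weight the three streams per agent. All module names, tensor shapes, and the softmax-gate design are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class GateFusion(nn.Module):
    """Fuse three equally shaped feature streams with learned gates.

    Hypothetical sketch of the gate fusion step named in the abstract:
    the three spatial branches (GCN / attention / LSTM) are assumed to
    produce per-agent features of the same dimensionality.
    """

    def __init__(self, dim: int):
        super().__init__()
        # One gate score per stream, computed from the concatenated features.
        self.gate = nn.Linear(3 * dim, 3)

    def forward(self, local_f, global_f, full_f):
        # Each input: (batch, agents, dim)
        stacked = torch.stack([local_f, global_f, full_f], dim=-2)    # (B, A, 3, D)
        scores = self.gate(torch.cat([local_f, global_f, full_f], dim=-1))
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)         # (B, A, 3, 1)
        # Convex combination of the three streams per agent.
        return (weights * stacked).sum(dim=-2)                        # (B, A, D)


# Usage: fuse per-agent features from the three spatial branches.
fusion = GateFusion(dim=64)
b, a, d = 8, 10, 64
fused = fusion(torch.randn(b, a, d), torch.randn(b, a, d), torch.randn(b, a, d))
print(fused.shape)  # torch.Size([8, 10, 64])
```

A softmax over per-stream scores keeps the fused output a convex combination of the branches; the paper's actual gates (e.g., sigmoid element-wise gating) may differ.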

Keywords