IET Intelligent Transport Systems (Sep 2022)

Multi‐future Transformer: Learning diverse interaction modes for behaviour prediction in autonomous driving

  • Baotian He,
  • Yibing Li

DOI
https://doi.org/10.1049/itr2.12207
Journal volume & issue
Vol. 16, no. 9
pp. 1249 – 1267

Abstract

Predicting the future behaviour of neighbouring agents is crucial for autonomous driving. The task is challenging largely because each agent's intent is diverse and unobservable, and it is further complicated by the many possible interactions between agents. The authors propose a multi‐future Transformer framework that implicitly models the multi‐modal joint distribution by capturing the diverse interaction modes of the scene. To this end, a parallel interaction module is constructed in which each interaction block learns the joint agent–agent and agent–map interactions for a possible future evolution. The model can perform likelihood estimation from the perspective of both the joint distribution of the scene and the marginal distribution of each agent. Combined with the proposed scene‐level winner‐take‐all loss strategy, which complements the model architecture, a single model achieves the best performance on both the target agent prediction and scene prediction tasks. To better utilise the scene context, comprehensive control experiments were conducted; they highlight the importance of fine‐grained scene representation with content‐adaptive aggregation and late fusion of semantic attributes. Evaluated on the popular Argoverse forecasting dataset, the method outperformed previous methods while maintaining low model complexity.
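
To make the idea of a scene-level winner-take-all selection concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the loss formula, so the error measure (average displacement error), the mean over agents, and all names (ade, scene_level_wta, agent_level_wta, preds, truth) are assumptions chosen for illustration. It contrasts picking one best mode for the whole scene with letting each agent pick its own best mode independently.

```python
import math

def ade(pred, truth):
    """Average displacement error between two trajectories of (x, y) points."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)

def scene_level_wta(preds, truth):
    """Pick the single mode whose joint prediction best fits the whole scene.

    preds[k][a] is the trajectory of agent a under mode k (hypothetical layout);
    truth[a] is the observed trajectory of agent a.
    Returns (best_mode_index, mean_error_of_that_mode).
    """
    n_agents = len(truth)
    scene_errors = [
        sum(ade(mode[a], truth[a]) for a in range(n_agents)) / n_agents
        for mode in preds
    ]
    best = min(range(len(preds)), key=scene_errors.__getitem__)
    # In training, only the winning mode would receive the regression loss.
    return best, scene_errors[best]

def agent_level_wta(preds, truth):
    """For comparison: every agent selects its own best mode independently."""
    return [
        min(range(len(preds)), key=lambda k: ade(preds[k][a], truth[a]))
        for a in range(len(truth))
    ]

# Toy scene: two agents, two candidate modes, two time steps each.
truth = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]
preds = [
    # Mode 0: agent 0 is exact, agent 1 is far off.
    [[(0, 0), (1, 0)], [(0, 4), (1, 4)]],
    # Mode 1: both agents are slightly off.
    [[(0, 0.5), (1, 0.5)], [(0, 1.5), (1, 1.5)]],
]

print(scene_level_wta(preds, truth))
print(agent_level_wta(preds, truth))
```

In this toy scene the scene-level rule selects mode 1, because it is the most consistent joint future, whereas the per-agent rule mixes modes 0 and 1 across agents. Such a mixture need not correspond to any single coherent future of the scene, which is the motivation for selecting the winner at scene level under the assumptions above.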