IEEE Access (Jan 2024)
BEVSeg2GTA: Joint Vehicle Segmentation and Graph Neural Networks for Ego Vehicle Trajectory Prediction in Bird’s-Eye-View
Abstract
Predicting the trajectory of the ego vehicle is a critical task for autonomous vehicles. Although traffic regulations define clear boundaries, the varied behaviors of agents in real-life situations introduce complexities that are hard to capture comprehensively. This has led to growing interest in learning-based ego vehicle trajectory prediction. In this paper, we introduce BEVSeg2GTA (Bird’s-Eye-View Joint Vehicle Segmentation and Graph Neural Network Trajectory Prediction), a novel approach that forecasts trajectories by treating perception and trajectory prediction as interconnected elements of a single system. By integrating these tasks, we demonstrate that perception accuracy can be improved and trajectory prediction error reduced. First, an encoder-decoder transformer-based deep network converts multi-view camera images into a Bird’s-Eye-View representation, followed by semantic segmentation of the crucial agents in the scene, including the ego vehicle, other vehicles, and pedestrians. A state-of-the-art backbone (such as EfficientNet) extracts strong features, which are used to construct a graph in which each object in the scene is represented by a node. The connections between these nodes are then established by a k-nearest-neighbors algorithm based on a distance metric. Next, the node and image features are fed into a Graph Neural Network to learn the complex spatial relationships between agents. Finally, the features learned by the Graph Neural Network are passed to a Spatio-Temporal Probabilistic Network to accurately predict the ego vehicle’s future trajectory. The proposed framework, BEVSeg2GTA, has been extensively evaluated on the nuScenes dataset. The results demonstrate that the proposed method improves upon state-of-the-art performance.
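To make the graph-construction step concrete, the following is a minimal PyTorch sketch (not the authors' implementation): object centroids from the BEV segmentation become graph nodes, edges are chosen by k-nearest neighbors on Euclidean distance, and one message-passing layer aggregates neighbor features. All names, layer sizes, and the mean-aggregation scheme are illustrative assumptions.

```python
import torch
import torch.nn as nn

def knn_edges(centroids: torch.Tensor, k: int) -> torch.Tensor:
    """Return a (2, N*k) edge index connecting each node to its k nearest neighbors."""
    dist = torch.cdist(centroids, centroids)            # (N, N) pairwise Euclidean distances
    dist.fill_diagonal_(float("inf"))                   # exclude self-loops
    knn = dist.topk(k, largest=False).indices           # (N, k) neighbor indices
    src = torch.arange(centroids.size(0)).repeat_interleave(k)
    return torch.stack([src, knn.reshape(-1)])          # edges: node -> neighbor

class MessagePassingLayer(nn.Module):
    """One round of mean-aggregation message passing over the kNN graph (illustrative)."""
    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
        src, dst = edges
        messages = self.msg(torch.cat([x[src], x[dst]], dim=-1))
        out = torch.zeros_like(x)
        out.index_add_(0, src, messages)                # sum incoming messages per node
        deg = torch.bincount(src, minlength=x.size(0)).clamp(min=1)
        return torch.relu(x + out / deg.unsqueeze(-1))  # mean-aggregate with residual

# Usage: 6 detected agents with 2-D BEV centroids and 64-d backbone features (assumed sizes).
centroids = torch.rand(6, 2) * 50.0
features = torch.randn(6, 64)
edges = knn_edges(centroids, k=3)
node_embeddings = MessagePassingLayer(64)(features, edges)
print(node_embeddings.shape)  # torch.Size([6, 64])
```

In the full pipeline these node embeddings would be combined with the image features and passed to the Spatio-Temporal Probabilistic Network; that stage is omitted here.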
Keywords