Mathematics (Dec 2023)

Benchmarking Perception to Streaming Inputs in Vision-Centric Autonomous Driving

  • Tianshi Jin,
  • Weiping Ding,
  • Mingliang Yang,
  • Honglin Zhu,
  • Peisong Dai

DOI
https://doi.org/10.3390/math11244976
Journal volume & issue
Vol. 11, no. 24
p. 4976

Abstract

Read online

In recent years, vision-centric perception has played a crucial role in autonomous driving tasks, encompassing functions such as 3D detection, map construction, and motion forecasting. However, the deployment of vision-centric approaches in practical scenarios is hindered by substantial latency, often deviating significantly from the outcomes achieved through offline training. This disparity arises from the fact that conventional benchmarks for autonomous driving perception predominantly conduct offline evaluations, thereby largely overlooking the latency concerns prevalent in real-world deployment. Although a few benchmarks have been proposed to address this limitation by introducing effective evaluation methods for online perception, they do not adequately consider the intricacies introduced by the complexity of input information streams. To address this gap, we propose the Autonomous driving Streaming I/O (ASIO) benchmark, aiming to assess the streaming input characteristics and online performance of vision-centric perception in autonomous driving. To facilitate this evaluation across diverse streaming inputs, we initially establish a dataset based on the CARLA Leaderboard. In alignment with real-world deployment considerations, we further develop evaluation metrics based on information complexity specifically tailored for streaming inputs and streaming performance. Experimental results indicate significant variations in model performance and ranking under different major camera deployments, underscoring the necessity of thoroughly accounting for the influences of model latency and streaming input characteristics during real-world deployment. To enhance streaming performance consistently across distinct streaming input features, we introduce a backbone switcher based on the identified streaming input characteristics. Experimental validation demonstrates its efficacy in perpetually improving streaming performance across varying streaming input features.

Keywords