Journal of Intelligent and Connected Vehicles (Oct 2022)
Development and testing of an image transformer for explainable autonomous driving systems
Abstract
Purpose – Perception has been identified as the main cause underlying most autonomous vehicle related accidents. As the key technology in perception, deep learning (DL) based computer vision models are generally considered to be black boxes due to poor interpretability. These have exacerbated user distrust and further forestalled their widespread deployment in practical usage. This paper aims to develop explainable DL models for autonomous driving by jointly predicting potential driving actions with corresponding explanations. The explainable DL models can not only boost user trust in autonomy but also serve as a diagnostic approach to identify any model deficiencies or limitations during the system development phase. Design/methodology/approach – This paper proposes an explainable end-to-end autonomous driving system based on “Transformer,” a state-of-the-art self-attention (SA) based model. The model maps visual features from images collected by onboard cameras to guide potential driving actions with corresponding explanations, and aims to achieve soft attention over the image’s global features. Findings – The results demonstrate the efficacy of the proposed model as it exhibits superior performance (in terms of correct prediction of actions and explanations) compared to the benchmark model by a significant margin with much lower computational cost on a public data set (BDD-OIA). From the ablation studies, the proposed SA module also outperforms other attention mechanisms in feature fusion and can generate meaningful representations for downstream prediction. Originality/value – In the contexts of situational awareness and driver assistance, the proposed model can perform as a driving alarm system for both human-driven vehicles and autonomous vehicles because it is capable of quickly understanding/characterizing the environment and identifying any infeasible driving actions. In addition, the extra explanation head of the proposed model provides an extra channel for sanity checks to guarantee that the model learns the ideal causal relationships. This provision is critical in the development of autonomous systems.
Keywords