IEEE Access (Jan 2023)
Spatially Recalibrated Convolutional Neural Network for Vehicle Type Recognition
Abstract
Vehicle Type Recognition (VTR) is a significant segment of the vehicle recognition field. It provides an alternative identification method alongside license plate recognition and vehicle make and model recognition. Most recent studies use Convolutional Neural Networks (CNNs) to perform VTR. However, the feature responses obtained from CNNs are not recalibrated according to saliency, which hinders classification performance. In this study, we propose a Spatial Attention Module (SAM) that is compatible with existing CNNs. We aim to exploit the spatial relationship between feature responses by scaling them according to their relative importance, thereby increasing classification accuracy. SAM achieves accuracies of 96.92%, 84.48% and 95.96% on the Beijing Institute of Technology (BIT)-Vehicle, Stanford Cars and web-nature Comprehensive Cars (CompCarsWeb) datasets, respectively. A qualitative inspection of the learned feature embedding suggests that the features within each group are highly cohesive. Furthermore, an ablation study is conducted to justify the hyperparameters chosen for SAM. SAM is also modular: it is highly compatible with other CNNs and leads to considerable performance improvement. A comparison with existing attention modules suggests that our proposal outperforms them in the VTR application. With inference times of 1 ms and 10 ms, respectively, CaffeNet-SAM and ResNet-SAM are also suitable for real-time classification tasks.
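The idea of scaling feature responses by their relative spatial importance can be illustrated with a minimal sketch. This is not the SAM architecture itself, whose layers and hyperparameters are not given in the abstract. The function name `spatial_attention`, the channel-mean saliency proxy, and the sigmoid gating are all assumptions made for illustration.

```python
import math


def spatial_attention(features):
    """Scale a feature map laid out as [C][H][W] by one weight per spatial location.

    Illustrative only: the saliency proxy and the gating function are assumptions,
    not the module proposed in the paper.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])

    # Saliency proxy (assumption): mean response across channels at each location.
    saliency = [
        [sum(features[c][i][j] for c in range(channels)) / channels
         for j in range(width)]
        for i in range(height)
    ]

    # Squash saliency into (0, 1) so it acts as a relative importance weight.
    weights = [[1.0 / (1.0 + math.exp(-s)) for s in row] for row in saliency]

    # Recalibrate: every channel is scaled by the same spatial weight map.
    recalibrated = [
        [[features[c][i][j] * weights[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(channels)
    ]
    return recalibrated, weights


if __name__ == "__main__":
    # Toy 2-channel, 2x2 feature map (made-up values).
    fmap = [
        [[0.0, 2.0], [-2.0, 0.0]],
        [[0.0, 2.0], [-2.0, 0.0]],
    ]
    scaled, attn = spatial_attention(fmap)
    print(attn)
    print(scaled)
```

In this sketch, locations with stronger average responses receive weights closer to 1 and are largely preserved, while weaker locations are suppressed. A learned module would replace the fixed channel mean with trainable parameters.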
Keywords