Tongxin xuebao (Oct 2023)
Lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution
Abstract
Aiming at the problems of discontinuous segmentation of thin strip objects that were easy to blend into the surrounding background and a large number of model parameters in the semantic segmentation algorithm of traffic scenes, a lightweight Transformer traffic scene semantic segmentation algorithm integrating multi-scale depth convolution was proposed.First, a multi-scale strip feature extraction module (MSEM) was constructed based on deep convolution to enhance the representation ability of thin strip target features at different scales.Secondly, a spatial detail auxiliary module (SDAM) was designed using the convolutional inductive bias feature in the shallow network to compensate for the loss of deep spatial detail information to optimize object edge segmentation.Finally, an asymmetric encoding-decoding network based on the Transformer-CNN framework (TC-AEDNet) was proposed.The encoder combined Transformer and CNN to alleviate the loss of detail information and reduce the amount of model parameters; while the decoder adopted a lightweight multi-level feature fusion design to further model the global context.The proposed algorithm achieves the mean intersection over union (mIoU) of 78.63% and 81.06% respectively on the Cityscapes and CamVid traffic scene public datasets.It can achieve a trade-off between segmentation accuracy and model size in traffic scene semantic segmentation and has a good application prospect.