Jisuanji kexue yu tansuo (Dec 2023)
Object Detection Algorithm for 3D Coordinate Attention Path Aggregation Network
Abstract
In practical industrial applications, YOLO-series algorithms do not localize predicted bounding boxes accurately enough, which makes them difficult to apply in real-world scenarios with strict positioning requirements. To address this, YOLO-T, an object detection algorithm built on a three-dimensional coordinate attention path aggregation network, is proposed. Firstly, shortcut connections are used to fuse cross-layer features in the path aggregation feature pyramid so that its shallow semantic information is retained. Secondly, building on the coordinate attention mechanism, a three-dimensional coordinate attention (TDCA) model is proposed; it applies attention weights to the features in the path aggregation feature pyramid (TPA-FPN, TDCA path aggregation feature pyramid network) to retain useful information and suppress redundant information. Thirdly, the loss matrix calculation of SimOTA (simplified optimal transport assignment) in the label assignment strategy is improved, which raises performance without any loss of efficiency. Finally, depthwise separable convolution is used to replace the convolution module in the backbone feature extraction network, making the model lightweight. Experimental results show that, on the PASCAL VOC2007+2012 dataset, the detection accuracy mAP@0.5 of the algorithm is 1.3 percentage points higher than that of YOLOX-S, and mAP@0.5:0.95 is 3.8 percentage points higher. On the COCO2017 dataset, the average detection accuracy mAP@0.5:0.95 is 2.4 percentage points higher.
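As a rough illustration of why depthwise separable convolution makes a model lighter, the sketch below compares weight counts for a standard convolution and its depthwise separable counterpart. It is not taken from YOLO-T: the function names and the layer sizes (128 input channels, 256 output channels, 3x3 kernel) are assumptions chosen only for this example, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias excluded)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 128, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    dws = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, depthwise separable: {dws}, ratio: {dws / std:.3f}")
```

For these sizes the ratio equals 1/c_out + 1/k^2, so the separable layer needs roughly one ninth of the weights; the saving in an actual backbone depends on its real channel widths and kernel sizes.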
Keywords