Jisuanji kexue yu tansuo (Jan 2021)

Video Target Detection Based on Improved YOLOV3 Algorithm

  • SONG Yanyan, TAN Li, MA Zihao, REN Xueping

DOI
https://doi.org/10.3778/j.issn.1673-9418.2003008
Journal volume & issue
Vol. 15, no. 1
pp. 163 – 172

Abstract

Pedestrian detection in surveillance video must cope with complex backgrounds, targets of widely varying scale and pose, and occlusion between people and surrounding objects. Under these conditions the YOLOV3 algorithm detects some targets inaccurately, producing false detections, missed detections, or repeated detections. To address this, the proposed method builds on the YOLOV3 network and, following the idea of residual structures, upsamples and fuses shallow and deep features to obtain detection layers at the 104×104 scale. Bounding-box sizes clustered by the K-means algorithm are assigned to the network layer of each scale, which increases the network's sensitivity to multi-scale and multi-pose targets and improves detection. In addition, the YOLOV3 loss function is updated with a repulsion loss between each prediction box and other surrounding targets, so that the prediction box moves closer to its correct target and away from wrong ones; this reduces the model's false detection rate and improves detection of mutually occluding targets. Experimental results on the MOT16 dataset show that the proposed network model detects better than the YOLOV3 algorithm, which demonstrates the effectiveness of the method.
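
The anchor-clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance measure or the number of clusters, so the code below assumes the 1 - IoU distance commonly used for YOLO anchor clustering; the function names (iou_wh, kmeans_anchors), the parameters, and the sample box sizes are all hypothetical and are not taken from the paper.

```python
import numpy as np

def iou_wh(boxes, centroids):
    """IoU between (w, h) pairs, treating every box as sharing one corner."""
    inter_w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = inter_w * inter_h
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centroids[:, 0] * centroids[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) box sizes into k anchors (assumed distance: 1 - IoU)."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct boxes picked at random.
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Highest IoU is the same as smallest 1 - IoU distance.
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        new = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    # Sort by area so small anchors can go to the finest-scale layer.
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]

# Hypothetical box sizes in pixels: three small and three large pedestrians.
sample_boxes = [[10, 20], [12, 22], [11, 21],
                [100, 200], [110, 210], [105, 205]]
anchors = kmeans_anchors(sample_boxes, k=2)
```

Sorting the resulting anchors by area is one plausible way to hand the smallest sizes to the highest-resolution detection layer and the largest to the coarsest; how the paper actually distributes the clustered sizes across scales is not specified in the abstract.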

Keywords