Applied Sciences (Jul 2023)
FusionTrack: Multiple Object Tracking with Enhanced Information Utilization
Abstract
Multi-object tracking (MOT) is one of the significant directions of computer vision. Though existing methods can solve simple tasks like pedestrian tracking well, some complex downstream tasks featuring uniform appearance and diverse motion remain difficult. Inspired by DETR, the tracking-by-attention (TBA) method uses transformers to accomplish multi-object tracking tasks. However, there are still issues with existing TBA methods within the TBA paradigm, such as difficulty detecting and tracking objects due to gradient conflict in shared parameters, and insufficient use of features to distinguish similar objects. We introduce FusionTrack to address these issues. It utilizes a joint track-detection decoder and a score-guided multi-level query fuser to enhance the usage of information within and between frames. With these improvements, FusionTrack achieves 11.1% higher by HOTA metric on the DanceTrack dataset compared with the baseline model MOTR.
Keywords