SiamSMN: Siamese Cross-Modality Fusion Network for Object Tracking

Shuo Han; Lisha Gao; Yue Wu; Tian Wei; Manyu Wang; Xu Cheng

doi:10.3390/info15070418

Information (Jul 2024)

Affiliations

Shuo Han: Nanjing Power Supply Branch, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210024, China
Lisha Gao: Nanjing Power Supply Branch, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210024, China
Yue Wu: Nanjing Power Supply Branch, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210024, China
Tian Wei: Nanjing Power Supply Branch, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210024, China
Manyu Wang: School of Computer and Cyberspace Security, Nanjing University of Information Science and Technology, Nanjing 210044, China
Xu Cheng: School of Computer and Cyberspace Security, Nanjing University of Information Science and Technology, Nanjing 210044, China

DOI: https://doi.org/10.3390/info15070418
Journal volume & issue: Vol. 15, no. 7
p. 418

Abstract

Existing Siamese trackers have achieved increasingly successful results in visual object tracking. However, the interactive fusion of multi-layer similarity maps after cross-correlation has not been fully studied in previous Siamese network-based methods. To address this issue, we propose a novel Siamese network for visual object tracking, named SiamSMN, which consists of a feature extraction network, a multi-scale fusion module, and a prediction head. First, the feature extraction network extracts features from the template image and the search image, and these features are combined by a depth-wise cross-correlation operation to produce multiple similarity feature maps. Second, we propose an effective multi-scale fusion module that extracts global context information for object search and learns the interdependencies between multi-level similarity maps. In addition, to further improve tracking accuracy, we design a learnable prediction head module that generates a boundary point for each side of the coarse bounding box, which addresses the inconsistency between classification and regression during tracking. Extensive experiments on four public benchmarks demonstrate that the proposed tracker performs competitively with other state-of-the-art trackers.
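
To make the depth-wise cross-correlation step mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `depthwise_xcorr`, the nested-list layout `[channel][row][column]`, and the toy values are assumptions chosen for illustration. It shows only the general operation, in which each channel of the template feature is slid over the matching channel of the search feature to give one similarity map per channel.

```python
def depthwise_xcorr(search, template):
    """Depth-wise cross-correlation on nested lists (illustrative sketch).

    search:   features of the search image,   shape [C][H][W]
    template: features of the template image, shape [C][h][w]
    returns:  one similarity map per channel,  shape [C][H-h+1][W-w+1]
    """
    if len(search) != len(template):
        raise ValueError("search and template must have the same channel count")

    maps = []
    for s_ch, t_ch in zip(search, template):
        H, W = len(s_ch), len(s_ch[0])
        h, w = len(t_ch), len(t_ch[0])
        # Slide the template channel over the search channel (no padding,
        # stride 1) and take the sum of element-wise products at each offset.
        response = [
            [
                sum(
                    s_ch[i + di][j + dj] * t_ch[di][dj]
                    for di in range(h)
                    for dj in range(w)
                )
                for j in range(W - w + 1)
            ]
            for i in range(H - h + 1)
        ]
        maps.append(response)
    return maps


# Toy example: 2 channels, 3x3 search features, 2x2 template features.
search = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
]
template = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]
similarity = depthwise_xcorr(search, template)
```

Because channels are never mixed, the output keeps the channel count of the inputs. Applying the same operation to features taken from several backbone layers would give the multiple similarity maps that the abstract's fusion module then combines; how those layers are chosen is not specified here.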

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/