Jisuanji kexue yu tansuo (Nov 2024)

3D Point Cloud Object Tracking Based on Multi-level Fusion of Transformer Features

  • LI Zhijie, LIANG Bowen, DING Xinmiao, GUO Wen

DOI
https://doi.org/10.3778/j.issn.1673-9418.2401071
Journal volume & issue
Vol. 18, no. 11
pp. 3006 – 3014

Abstract


In 3D point cloud object tracking, issues such as occlusion, point sparsity, and random noise frequently arise. To address these challenges, this paper proposes a 3D point cloud object tracking method based on multi-level fusion of Transformer features. The method consists mainly of a point attention embedding module and a point attention enhancement module, which are used in the feature extraction and feature matching stages, respectively. First, two attention mechanisms are embedded into each other to form the point attention embedding module, which is combined with the relation-aware sampling strategy proposed in PTTR (point relation transformer for tracking) so that features are extracted thoroughly. The extracted features are then fed into the point attention enhancement module, where cross-attention matches features from different levels in sequence, achieving deep fusion of global and local features. Furthermore, to obtain a discriminative feature fusion map, a residual network connects the fusion results of different layers. Finally, the feature fusion map is passed to the target prediction module to produce a precise prediction of the final 3D target object. Experiments on the KITTI, nuScenes, and Waymo datasets validate the effectiveness of the proposed method. Excluding few-shot data, the method improves tracking success by an average of 1.4 percentage points and precision by an average of 1.4 percentage points.
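To illustrate the matching scheme described above, the following is a minimal sketch of cross-attention fusion between template and search-region features, applied level by level and linked with residual connections. The module names, feature dimensions, attention head count, and overall structure are illustrative assumptions for exposition only, not the authors' implementation.

import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Fuse template and search-region features at one level via cross-attention."""

    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, search_feat: torch.Tensor, template_feat: torch.Tensor) -> torch.Tensor:
        # Queries come from the search region, keys/values from the template,
        # so each search point aggregates matching template information.
        fused, _ = self.attn(query=search_feat, key=template_feat, value=template_feat)
        return self.norm(search_feat + fused)


class MultiLevelFusion(nn.Module):
    """Match features level by level and connect the per-level fusion maps residually."""

    def __init__(self, dim: int, num_levels: int = 3):
        super().__init__()
        self.levels = nn.ModuleList([CrossAttentionFusion(dim) for _ in range(num_levels)])

    def forward(self, search_feats, template_feats):
        fused = None
        for layer, s, t in zip(self.levels, search_feats, template_feats):
            out = layer(s, t)
            # Residual connection across levels keeps earlier fusion results
            # present in the final, more discriminative fusion map.
            fused = out if fused is None else fused + out
        return fused


if __name__ == "__main__":
    dim, n_template, n_search = 128, 64, 128
    search = [torch.randn(1, n_search, dim) for _ in range(3)]
    template = [torch.randn(1, n_template, dim) for _ in range(3)]
    fusion_map = MultiLevelFusion(dim)(search, template)
    print(fusion_map.shape)  # torch.Size([1, 128, 128])

In such a layout, the resulting fusion map would be handed to a prediction head that regresses the 3D bounding box; that head is omitted here since the abstract does not detail it.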

Keywords