Cross-modal retrieval based on multi-dimensional feature fusion hashing

Dongxiao Ren; Weihua Xu

doi:10.3389/fphy.2024.1379873

Frontiers in Physics (Jun 2024)

Affiliations

Dongxiao Ren: Department of Data Science, School of Science, Zhejiang University of Science and Technology, Hangzhou, China
Weihua Xu: Department of Digital Finance, Quanzhou Branch of Industrial and Commercial Bank of China, Quanzhou, China

DOI: https://doi.org/10.3389/fphy.2024.1379873
Journal volume & issue: Vol. 12

Abstract

With the continued advance and spread of information network technology, multi-modal data, including text, images, video, and audio, is growing rapidly. Retrieving data across these modalities serves a wide range of needs, so cross-modal retrieval has both theoretical significance and practical value. Because data of different modalities can be retrieved against one another once they are mapped into a unified Hamming space, hash codes are widely used in cross-modal retrieval. However, existing cross-modal hashing models generate hash codes from single-dimension data features and ignore the semantic correlation between data features in different dimensions. We therefore propose a cross-modal retrieval method based on Multi-Dimensional Feature Fusion Hashing (MDFFH). To better capture the multi-dimensional semantic features of images, a convolutional neural network and a Vision Transformer are combined to construct an image multi-dimensional fusion module. Similarly, a text multi-dimensional fusion module is applied to the text modality to obtain the multi-dimensional semantic features of text. Through feature fusion, these two modules integrate the semantic features of data in different dimensions, making the generated hash codes more representative and semantically richer. Extensive experiments and the corresponding analysis on two datasets indicate that MDFFH outperforms the baseline models.
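To make the Hamming-space retrieval idea concrete, the following is a minimal sketch in Python with NumPy. It is not the MDFFH model: the function names `to_hash_codes` and `hamming_rank` are hypothetical, and a fixed linear projection stands in, as an assumption, for the learned image and text networks described in the abstract. It only illustrates how items from two modalities, once binarized into codes of the same length, can be ranked against each other by Hamming distance.

```python
import numpy as np

def to_hash_codes(features, projection):
    """Binarize real-valued features into {0, 1} codes.

    Assumption: a plain linear projection followed by thresholding at zero
    stands in for a learned hashing network.
    """
    return (features @ projection > 0).astype(np.uint8)

def hamming_rank(query_code, database_codes):
    """Rank database items by Hamming distance to a single query code."""
    # Count the bit positions where each database code differs from the query.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # Stable sort keeps the original order among equally distant items.
    order = np.argsort(distances, kind="stable")
    return order, distances

# Toy example: a "text" query code searched against three "image" codes.
image_codes = np.array([[0, 0, 0, 0],
                        [1, 1, 1, 1],
                        [1, 0, 0, 0]], dtype=np.uint8)
text_query = np.array([1, 0, 0, 0], dtype=np.uint8)

order, distances = hamming_rank(text_query, image_codes)
print(order.tolist())      # [2, 0, 1]
print(distances.tolist())  # [1, 3, 0]
```

Because both modalities share one code length, the same ranking routine serves image-to-text and text-to-image queries; only the encoder that produces the codes differs.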

Published in Frontiers in Physics

ISSN: 2296-424X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Physics
Website: https://www.frontiersin.org/journals/physics
