IEEE Access (Jan 2019)

Deep Multi-Level Semantic Hashing for Cross-Modal Retrieval

  • Zhenyan Ji
  • Weina Yao
  • Wei Wei
  • Houbing Song
  • Huaiyu Pi

DOI
https://doi.org/10.1109/ACCESS.2019.2899536
Journal volume & issue
Vol. 7
pp. 23667–23674

Abstract

With the rapid growth of multimodal data, cross-modal search has attracted wide research interest. Owing to their efficiency in storage and computation, hashing-based methods are widely used for large-scale cross-modal retrieval. Most existing hashing methods rely on binary supervision, which reduces the complex relationships among multi-label data to a simple similar/dissimilar distinction. Few methods, however, have exploited the rich semantic information implicit in multi-label data to improve retrieval accuracy. In this paper, a multi-level semantic supervision generating approach is proposed that explores label relevance, and a deep hashing framework is designed for multi-label image-text cross-modal retrieval tasks. The framework simultaneously captures the binary similarity and the complex multi-level semantic structure of data across modalities. Moreover, the effects of three convolutional neural networks, CNN-F, VGG-16, and ResNet-50, on the retrieval results are compared. Experimental results on an open-source cross-modal dataset show that our approach outperforms several state-of-the-art hashing methods, and that retrieval with the CNN-F network outperforms VGG-16 and ResNet-50.
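To make concrete the distinction drawn above between binary supervision and multi-level semantic supervision, the short Python sketch below contrasts the two on a toy multi-label pair. The graded Jaccard affinity is an illustrative assumption; the paper's exact formulation of label relevance is not given in this abstract.

    import numpy as np

    def binary_similarity(labels_a, labels_b):
        # Conventional binary supervision: 1 if the samples share any
        # label, else 0 -- the multi-label structure is collapsed.
        return float(np.any(np.logical_and(labels_a, labels_b)))

    def multilevel_similarity(labels_a, labels_b):
        # Hypothetical multi-level supervision: graded affinity from
        # label overlap (Jaccard), preserving how many labels are shared.
        inter = np.logical_and(labels_a, labels_b).sum()
        union = np.logical_or(labels_a, labels_b).sum()
        return inter / union if union else 0.0

    # Two samples annotated over 5 possible labels.
    a = np.array([1, 1, 0, 1, 0])   # labels {0, 1, 3}
    b = np.array([1, 0, 0, 1, 1])   # labels {0, 3, 4}

    print(binary_similarity(a, b))      # 1.0 -> just "similar"
    print(multilevel_similarity(a, b))  # 0.5 -> shares 2 of 4 labels

A hashing loss supervised by such graded affinities can then place a pair sharing three labels closer in Hamming space than a pair sharing one, which is exactly the structure a 0/1 signal discards.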

Keywords