Symmetry (Aug 2024)

An Efficient Cross-Modal Privacy-Preserving Image–Text Retrieval Scheme

  • Kejun Zhang,
  • Shaofei Xu,
  • Yutuo Song,
  • Yuwei Xu,
  • Pengcheng Li,
  • Xiang Yang,
  • Bing Zou,
  • Wenbin Wang

DOI
https://doi.org/10.3390/sym16081084
Journal volume & issue
Vol. 16, no. 8
p. 1084

Abstract

Preserving the privacy of the ever-growing volume of multimedia data on the cloud while providing accurate and fast retrieval services has become an active topic in information security. However, existing schemes still leave significant room for improvement in accuracy and speed. This paper therefore proposes a privacy-preserving image–text retrieval scheme called PITR. To enhance model performance while training only a small number of parameters, we freeze all parameters of a multimodal pre-trained model and add trainable modules together with either a general adapter or a specialized adapter, which improve the model’s zero-shot image classification and cross-modal retrieval on general or specialized datasets, respectively. To preserve the privacy of both the data outsourced to the cloud and the user’s retrieval process, we employ asymmetric scalar-product-preserving encryption, which supports inner-product computation over encrypted vectors, together with distributed index storage, and we construct a two-level security model. To speed up query matching among massive high-dimensional index vectors, we construct a hierarchical index structure. Experimental results demonstrate that our scheme provides users with secure, accurate, and fast cross-modal retrieval while preserving data privacy.
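The inner-product property behind asymmetric scalar-product-preserving encryption can be illustrated with a minimal sketch. It shows only the textbook core idea, assuming an invertible secret matrix M, index vectors encrypted as Mᵀp, and query vectors encrypted as M⁻¹q; it is not the exact PITR construction, whose details are not given in the abstract. All function names are hypothetical, and exact rational arithmetic is used here only to keep the check free of rounding error.

```python
import random
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inverse over exact rationals; raises ValueError if singular."""
    n = len(matrix)
    # Augment a copy of the matrix with the identity matrix.
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def keygen(dim, rng):
    """Secret key (assumed form): a random invertible matrix M and its inverse."""
    while True:
        m = [[Fraction(rng.randint(-9, 9)) for _ in range(dim)] for _ in range(dim)]
        try:
            return m, invert(m)
        except ValueError:
            continue  # singular draw, try again


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def encrypt_index(m, p):
    """Index side: E(p) = M^T p."""
    m_transposed = [list(col) for col in zip(*m)]
    return mat_vec(m_transposed, p)


def encrypt_query(m_inv, q):
    """Query side: E(q) = M^{-1} q."""
    return mat_vec(m_inv, q)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    rng = random.Random(0)
    M, M_inv = keygen(4, rng)
    p = [Fraction(x) for x in (1, 2, 3, 4)]   # toy index vector
    q = [Fraction(x) for x in (2, 0, -1, 5)]  # toy query vector
    # E(p) . E(q) = p^T M M^{-1} q = p . q, so the server can rank by
    # similarity without seeing the plaintext vectors.
    assert dot(encrypt_index(M, p), encrypt_query(M_inv, q)) == dot(p, q)
```

Because index and query vectors are encrypted with different transforms, the scheme is asymmetric: the product of an encrypted index with an encrypted query recovers the plaintext inner product, while the product of two encrypted index vectors generally does not.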

Keywords