IEEE Access (Jan 2023)

Universal Image Embedding: Retaining and Expanding Knowledge With Multi-Domain Fine-Tuning

  • Socratis Gkelios,
  • Anestis Kastellos,
  • Yiannis S. Boutalis,
  • Savvas A. Chatzichristofis

DOI
https://doi.org/10.1109/ACCESS.2023.3267804
Journal volume & issue
Vol. 11
pp. 38208 – 38217

Abstract

This study proposes a novel fine-tuning method for the CLIP architecture that retains the pre-existing knowledge learned from large datasets while producing a domain-agnostic image encoder for universal image embedding, addressing the challenge of transferring knowledge from source to target tasks with deep learning models. To evaluate its effectiveness, the resulting encoder is applied directly, without task-specific fine-tuning, to a wide range of instance retrieval and recognition tasks. The main findings indicate that the proposed method significantly enhances performance on unseen domains without requiring separate fine-tuning for each domain. The method was inspired by the authors' success in the Google Universal Image Embedding competition, where they were awarded a gold medal among 1200 teams. These results have significant implications for real-life applications in which multiple domains are common. In conclusion, the study offers a practical transfer-learning solution for multi-domain settings and may inspire further research in this area.

Keywords