IEEE Open Journal of the Industrial Electronics Society (Jan 2024)
Dual Modality Reverse Reranking (DM-RR) Based Image Retrieval Framework
Abstract
Retrieving a product with desired modifications from the vast inventories of online industrial platforms is a task frequently encountered in daily life. This study presents a specialized framework that retrieves the user's queried product with the desired changes incorporated. To facilitate interaction between the end user and the agent in such scenarios, a multimodal content-based image retrieval system is essential. The system extracts textual and visual attributes and combines them through inductive learning into a unified representation. It is based on an in-depth understanding of the visual characteristics that are modified by the textual semantics. Lastly, a novel reverse reranking (RR) algorithm arranges the joint representations of dual-modality queries and their corresponding target images for efficient retrieval. The proposed framework is novel compared to earlier methodologies in two respects. First, it achieves successful fusion of two different modalities. Second, it introduces an RR algorithm in the inference stage for efficient retrieval. The enhanced performance of the proposed framework has been assessed on the real-world benchmark datasets Fashion-200K and MIT-States. The proposed system can be used in real-world applications, subject to practical considerations such as generalization to diverse domains, the availability of domain-specific data, the nature of the data and queries, and the availability of computational resources.
Keywords