IEEE Access (Jan 2025)
Context-Aware Fine-Grained Product Recognition on Grocery Shelves
Abstract
Product recognition is a fine-grained image retrieval problem: a grocery store can stock several thousand products, some of which differ only minimally in appearance. Because the products on store shelves change frequently, a practical system should work with only one or a few reference images per product. This task is very challenging for computer vision, whereas humans solve it with far less effort by relying on contextual information. Our work incorporates semantic and spatial context into a novel product recognition method. We take advantage of the fact that stores are organized into sections and aisles, where similar products are placed near each other. First, we propose the Hierarchical Auxiliary Loss (HAL) for learning an organized feature space in which products from the same category, which are usually placed in the same store section, lie close to each other. Second, we propose the Context-Aware Query Expansion (CAQE) module for the inference phase, in which the feature vector of each product on a shelf is expanded with the feature vectors of neighboring products. The amount of information exchanged between two products depends on the similarity of their feature vectors and on their spatial distance on the shelf. To demonstrate the effectiveness of our contributions, we conducted detailed experiments on publicly available grocery product datasets and showed that our method achieves state-of-the-art results.
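The query expansion idea can be illustrated with a minimal sketch. The abstract does not give the exact CAQE formulation, so everything specific here is an assumption made for illustration only: the function name `expand_queries`, the use of clipped cosine similarity, the Gaussian spatial kernel with bandwidth `sigma`, and the mixing weight `alpha` are all hypothetical choices, not the authors' method.

```python
import numpy as np


def expand_queries(features, positions, sigma=1.0, alpha=0.5):
    """Blend each product's feature vector with those of its shelf neighbors.

    Hypothetical sketch: the weight between two products is the product of
    their (non-negative) cosine similarity and a Gaussian kernel on their
    spatial distance. This weighting is an assumption, not the paper's formula.

    features:  (n, d) array of non-zero feature vectors, one per detected product
    positions: (n, 2) array of product centers on the shelf image
    """
    # L2-normalize so that dot products are cosine similarities.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)

    # Feature similarity; negative similarities contribute nothing.
    sim = np.clip(f @ f.T, 0.0, None)

    # Gaussian kernel on squared Euclidean distance between product centers.
    diff = positions[:, None, :] - positions[None, :, :]
    spatial = np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * sigma ** 2))

    # Combined weights; a product does not exchange information with itself.
    w = sim * spatial
    np.fill_diagonal(w, 0.0)

    # Row-normalize; rows with no usable neighbor stay all-zero.
    row_sum = w.sum(axis=1, keepdims=True)
    w = np.divide(w, row_sum, out=np.zeros_like(w), where=row_sum > 0)

    # Mix the original feature with the weighted neighbor average.
    expanded = (1.0 - alpha) * f + alpha * (w @ f)
    return expanded / np.linalg.norm(expanded, axis=1, keepdims=True)
```

In this sketch, a product that has no similar neighbor nearby keeps its original (normalized) feature vector, while a product surrounded by similar items is pulled toward them before the retrieval step.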
Keywords