Applied Sciences (Sep 2022)
A Study on the Channel Expansion VAE for Content-Based Image Retrieval
Abstract
Content-based image retrieval (CBIR) focuses on image searching with fine-tuning of pre-trained off-the-shelf features. CBIR is an intuitive method for image retrieval, although it still requires labeled datasets for fine-tuning, which is inefficient because of the cost of annotation. Therefore, we explored an unsupervised model for feature extraction of image contents. We used a variational auto-encoder (VAE) with expanded channels in its neural networks and studied the activations of the layer outputs. In this study, the channel expansion method boosted the capability of image retrieval by exploring more kernels and selecting a layer whose activations are comparatively concentrated on the object region. The experiment included a comparison of channel expansion settings and a visualization of each layer in the encoder network. The proposed model achieved 52.7% mAP, outperforming the existing VAE (36.5%) on the MNIST dataset.
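The abstract reports retrieval quality as mean average precision (mAP) but does not state how it is computed. As a minimal sketch of one common definition, and not necessarily the exact protocol used in the study, the Python code below treats every item as a query, ranks all other items by Euclidean distance between feature vectors, and counts an item as relevant when it shares the query's class label. The function names, the distance metric, and the leave-one-out query scheme are all assumptions made for illustration.

```python
import math


def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: sequence of booleans (or 0/1) in ranked order,
    True where the retrieved item is relevant to the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(features, labels):
    """Leave-one-out retrieval mAP (assumed protocol, for illustration).

    features: list of equal-length numeric vectors, e.g. encoder outputs.
    labels:   list of class labels, one per feature vector.
    """
    aps = []
    for q, query in enumerate(features):
        # Rank every other item by Euclidean distance to the query.
        candidates = [i for i in range(len(features)) if i != q]
        candidates.sort(key=lambda i: math.dist(query, features[i]))
        relevance = [labels[i] == labels[q] for i in candidates]
        aps.append(average_precision(relevance))
    return sum(aps) / len(aps)
```

In this sketch, features extracted by any encoder (a VAE latent vector or a flattened intermediate layer output) could be passed in as `features`, so the same function can compare different layers or channel settings under identical ranking rules.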
Keywords