IEEE Access (Jan 2024)
Multi-Label Zero-Shot Learning With Adversarial and Variational Techniques
Abstract
Multi-label zero-shot learning (ZSL) extends the traditional single-label ZSL paradigm to images that contain multiple unseen classes, that is, classes absent from the training data. Current techniques rely on attention mechanisms to handle multi-label ZSL and generalized zero-shot learning (GZSL). However, feature generation with generative models remains unexplored in this setting. In this paper, we propose a generative approach that combines a Conditional Variational Autoencoder (CVAE) and a Conditional Generative Adversarial Network (CGAN) to improve the quality of the generated features for both multi-label ZSL and GZSL. Additionally, we introduce a novel “Regressor” as a supplementary component to improve the reconstruction of visual features. The Regressor is trained together with a “cycle-consistency loss” so that the generated features preserve the key qualities of the original features even after undergoing transformations. To evaluate the proposed approach, we conducted experiments on two widely used benchmark datasets, NUS-WIDE and MS COCO, in both the multi-label ZSL and GZSL settings. Our approach improved mean Average Precision (mAP) on both datasets: in multi-label ZSL, we observed a 0.2% increase on NUS-WIDE and a 2.6% increase on MS COCO. These results demonstrate that our generative approach outperforms existing methods on both benchmarks.
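For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean Average Precision is commonly computed in multi-label classification: an Average Precision is computed per class by ranking images by their predicted score, and the per-class values are then averaged. This is an illustration under stated assumptions, not the evaluation code used in the paper. The function names, the toy scores, and the choice to assign an AP of 0 to a class with no positive images are all hypothetical; some evaluation protocols skip such classes instead.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP for one class: mean of precision@k over the ranks k that hold a positive image."""
    # Rank image indices by descending predicted score (ties keep original order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: a class with no positive images contributes an AP of 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    score_matrix: List[List[float]], label_matrix: List[List[int]]
) -> float:
    """mAP over classes; score_matrix[i][c] is the score of image i for class c."""
    num_classes = len(score_matrix[0])
    per_class_ap = []
    for c in range(num_classes):
        class_scores = [row[c] for row in score_matrix]
        class_labels = [row[c] for row in label_matrix]
        per_class_ap.append(average_precision(class_scores, class_labels))
    return sum(per_class_ap) / num_classes


# Toy example: 3 images, 2 classes (values are made up for illustration).
toy_scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6]]
toy_labels = [[1, 0], [0, 1], [1, 1]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

In the toy example, the first class has positives at ranks 1 and 3 (AP of 5/6) and the second class has positives at ranks 1 and 2 (AP of 1), so the mAP is 11/12.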
Keywords