Electronic Research Archive (Oct 2024)
Open-world barely-supervised learning via augmented pseudo labels
Abstract
Open-world semi-supervised learning (OWSSL) has received significant attention since it addresses the issue of unlabeled data containing classes not present in the labeled data. Unfortunately, existing OWSSL methods still rely on a large amount of labeled data from seen classes, overlooking the reality that a substantial amount of labels is difficult to obtain in real scenarios. In this paper, we explored a new setting called open-world barely-supervised learning (OWBSL), where only a single label was provided for each seen class, greatly reducing labeling costs. To tackle the OWBSL task, we proposed a novel framework that leveraged augmented pseudo-labels generated for the unlabeled data. Specifically, we first generated initial pseudo-labels for the unlabeled data using visual-language models. Subsequently, to ensure that the pseudo-labels remained reliable while being updated during model training, we enhanced them using predictions from weak data augmentation. This way, we obtained the augmented pseudo-labels. Additionally, to fully exploit the information from unlabeled data, we incorporated consistency regularization based on strong and weak augmentations into our framework. Our experimental results on multiple benchmark datasets demonstrated the effectiveness of our method.
Keywords