IEEE Access (Jan 2024)
Dense-PU: Learning a Density-Based Boundary for Positive and Unlabeled Learning
Abstract
This study proposes a novel approach to the positive and unlabeled (PU) learning problem based on an anomaly detection strategy. A Convolutional Autoencoder (CAE) extracts latent encodings from positive-labeled data, and these encodings are linearly combined to generate new samples that lie between them. The new samples serve as embeddings that define a boundary approximating the positive class. Unlabeled data points that differ significantly from the majority of the data are assumed to be negative samples. Once a set of negative samples is obtained, the problem can be treated as a typical binary-classification problem. The approach was evaluated on two benchmark image datasets, CIFAR-10 and Fashion-MNIST, yielding F1-scores of 91.96% and 94.80%, respectively. These results indicate that Dense-PU is effective at identifying negative samples in unlabeled data and thereby improves classification performance.
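The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the CAE latent encodings are already available as NumPy arrays, it uses a mean k-nearest-neighbor distance as a stand-in for the density criterion, and it sets the threshold at a quantile of the positives' own scores. The function names and the parameters `n_new`, `k`, and `quantile` are hypothetical.

```python
import numpy as np


def interpolate_latents(z_pos, n_new, rng):
    """Linearly combine random pairs of positive latent codes.

    Each new sample is alpha * a + (1 - alpha) * b with alpha in [0, 1],
    so it lies on the segment between two positive encodings.
    """
    i = rng.integers(0, len(z_pos), size=n_new)
    j = rng.integers(0, len(z_pos), size=n_new)
    alpha = rng.uniform(0.0, 1.0, size=(n_new, 1))
    return alpha * z_pos[i] + (1.0 - alpha) * z_pos[j]


def knn_distance(queries, reference, k=5):
    """Mean Euclidean distance from each query to its k nearest reference points."""
    diffs = queries[:, None, :] - reference[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return np.sort(dists, axis=1)[:, :k].mean(axis=1)


def select_likely_negatives(z_pos, z_unl, n_new=500, k=5, quantile=0.95, seed=0):
    """Flag unlabeled encodings that fall far from the interpolated positives.

    Assumption: the boundary is approximated by thresholding the k-NN distance
    at a quantile of the scores obtained by the positive encodings themselves.
    Returns a boolean mask over z_unl (True means likely negative).
    """
    rng = np.random.default_rng(seed)
    boundary = interpolate_latents(z_pos, n_new, rng)
    threshold = np.quantile(knn_distance(z_pos, boundary, k), quantile)
    return knn_distance(z_unl, boundary, k) > threshold
```

The flagged samples can then be paired with the positive set to train an ordinary binary classifier, which is the final step the abstract describes.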
Keywords