IEEE Access (Jan 2024)

Dense-PU: Learning a Density-Based Boundary for Positive and Unlabeled Learning

  • Vasileios Sevetlidis,
  • George Pavlidis,
  • Spyridon G. Mouroutsos,
  • Antonios Gasteratos

DOI
https://doi.org/10.1109/ACCESS.2024.3420453
Journal volume & issue
Vol. 12
pp. 90287 – 90298

Abstract

In this study, a novel approach to the positive and unlabeled (PU) learning problem is proposed, based on an anomaly detection strategy. A Convolutional Autoencoder (CAE) is used to extract latent encodings from positive-labeled data, which are then linearly combined to produce new samples that lie between them. These new samples are used as embeddings to define a boundary that approximates the positive class. Data points that differ significantly from the majority of the data are assumed to be negative samples. Once a set of negative samples is obtained, the problem can be treated as a typical binary classification problem. The approach was evaluated on two benchmark image datasets, CIFAR-10 and Fashion-MNIST, yielding F1-scores of 91.96% and 94.80%, respectively. These results demonstrate the efficacy of Dense-PU in improving classification performance by identifying negative samples in unlabeled data.
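
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the boundary is computed, so the nearest-neighbour distance rule and the quantile threshold below are assumptions, as are the function names and the parameters n_new and quantile. The latent codes are assumed to come from an already trained CAE encoder, which is not shown.

```python
import numpy as np


def interpolate_latents(z_pos, n_new, rng):
    """Build new latent codes as convex combinations of random pairs of positives."""
    i = rng.integers(0, len(z_pos), size=n_new)
    j = rng.integers(0, len(z_pos), size=n_new)
    alpha = rng.uniform(0.0, 1.0, size=(n_new, 1))
    return alpha * z_pos[i] + (1.0 - alpha) * z_pos[j]


def nearest_distance(queries, reference, skip_self=False):
    """Euclidean distance from each query to its nearest reference point."""
    d = np.linalg.norm(queries[:, None, :] - reference[None, :, :], axis=2)
    if skip_self:
        # Only meaningful when queries and reference are the same array.
        np.fill_diagonal(d, np.inf)
    return d.min(axis=1)


def select_negatives(z_pos, z_unlabeled, n_new=500, quantile=0.99, seed=0):
    """Flag unlabeled codes that lie far from the densified positive set.

    Assumption: "far" means farther than the chosen quantile of the
    positives' own nearest-neighbour distances.
    """
    rng = np.random.default_rng(seed)
    # Densify the positive region with interpolated codes.
    support = np.vstack([z_pos, interpolate_latents(z_pos, n_new, rng)])
    threshold = np.quantile(nearest_distance(z_pos, z_pos, skip_self=True), quantile)
    return nearest_distance(z_unlabeled, support) > threshold
```

The returned boolean mask marks candidate negatives. Together with the labeled positives, they could then be passed to any ordinary binary classifier, which corresponds to the final step the abstract describes.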

Keywords