IEEE Access (Jan 2024)

Self-Training Algorithm With Block Similar Neighbor Editing

  • Wenwang Bai,
  • Cuihong Zhang,
  • Zhengguo Yang,
  • He Yang

DOI
https://doi.org/10.1109/ACCESS.2024.3440915
Journal volume & issue
Vol. 12
pp. 110418 – 110431

Abstract

In the real world, only a small fraction of data comes with labels. To make full use of the structural information latent in unlabeled data when training a classifier, researchers have proposed many semi-supervised learning algorithms. Among them, self-training is one of the most widely used semi-supervised learning frameworks because of its simplicity. Selecting high-confidence samples is a crucial step in self-training: if misclassified samples are selected as high-confidence samples, the error is amplified over the iterations, which degrades the performance of the final classifier. To alleviate this problem, this paper proposes a self-training algorithm with block-similar neighbor editing (STBSNE). STBSNE computes the distance between samples with a block-based dissimilarity measure, which improves classification performance on high-dimensional data sets. It defines the block-estimated neighbor relationship, builds the corresponding block-estimated neighbor graph, and proposes a block-estimated neighbor editing method that identifies outliers and noise points and edits them, improving the quality of the selected high-confidence samples. Experimental results on 16 benchmark data sets verify the superior performance of the proposed STBSNE compared with seven state-of-the-art algorithms.
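The self-training loop the abstract describes — repeatedly pseudo-labeling the unlabeled pool and absorbing only high-confidence predictions — can be sketched as follows. This is a minimal, generic illustration only, not the authors' STBSNE: it uses a plain 1-NN classifier and a distance cutoff as a stand-in for confidence, and omits the block-based dissimilarity measure and the block-estimated neighbor editing step that the paper contributes.

```python
import numpy as np

def self_train_1nn(X_l, y_l, X_u, dist_threshold=0.5, max_iter=10):
    """Generic self-training sketch: each round, pseudo-label unlabeled
    points by their nearest labeled neighbor and absorb only those that
    are very close (a simple proxy for 'high confidence')."""
    X_l, y_l, X_u = X_l.copy(), y_l.copy(), X_u.copy()
    for _ in range(max_iter):
        if len(X_u) == 0:
            break
        # pairwise Euclidean distances: unlabeled rows x labeled columns
        d = np.linalg.norm(X_u[:, None, :] - X_l[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        conf_mask = d.min(axis=1) <= dist_threshold  # "high-confidence" pick
        if not conf_mask.any():
            break
        # append pseudo-labeled points to the labeled set, shrink the pool
        new_labels = y_l[nearest[conf_mask]]
        X_l = np.vstack([X_l, X_u[conf_mask]])
        y_l = np.concatenate([y_l, new_labels])
        X_u = X_u[~conf_mask]
    return X_l, y_l

# toy data: two Gaussian blobs, only 5 labeled points per class
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-2.0, size=(50, 2))
X1 = rng.normal(loc=+2.0, size=(50, 2))
X_l = np.vstack([X0[:5], X1[:5]])
y_l = np.array([0] * 5 + [1] * 5)
X_u = np.vstack([X0[5:], X1[5:]])

X_aug, y_aug = self_train_1nn(X_l, y_l, X_u)
print(len(X_aug))  # labeled set has grown beyond the initial 10
```

The paper's contribution targets exactly the weak point of this loop: if a mislabeled point enters `X_l`, it keeps propagating its wrong label in later rounds, which is why STBSNE edits the candidate set before absorbing it.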

Keywords