IEEE Access (Jan 2021)

Batch Mode Active Learning Based on Multi-Set Clustering

  • Yazhou Yang,
  • Xiaoqing Yin,
  • Yang Zhao,
  • Jun Lei,
  • Weili Li,
  • Zhe Shu

DOI
https://doi.org/10.1109/ACCESS.2021.3053003
Journal volume & issue
Vol. 9
pp. 51452 – 51463

Abstract

Batch mode active learning, in which a batch of samples is selected and labeled simultaneously, is a challenging task. The challenge lies in maintaining the informativeness of the selected samples while preserving their diversity. We propose a novel batch mode active learning method that balances informativeness and representativeness using multi-set clustering. Our method uses a sequential active learner to retain informativeness: the learner provides a ranking of the unlabeled samples, from which multiple informative sets are constructed for the subsequent clustering. K-means clustering is then used to minimize the redundancy among these samples and to improve representativeness. Finally, the batch chosen is the one that minimizes the expected predictive variance over all the data. Experimental results on a large number of benchmark datasets demonstrate excellent performance of the proposed method in comparison with current state-of-the-art batch mode active learning approaches.
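The pipeline outlined in the abstract (rank, build several informative sets, cluster each set, keep the best batch) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The abstract does not say how the informative sets are built or how the expected predictive variance is computed, so the code makes two loudly stated assumptions: each informative set is taken to be the top-m samples of the ranking for several values of m, and a simple coverage cost (summed squared distance from every sample to its nearest batch member) stands in for the expected predictive variance criterion. All function names and parameters are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of tuples; returns k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires k <= len(points)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # Recompute each center as its cluster mean; keep the old center
        # if a cluster ended up empty.
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
    return centers


def candidate_batch(X, idxs, k, seed=0):
    """Cluster one informative set and pick the sample nearest each center."""
    centers = kmeans([X[i] for i in idxs], k, seed=seed)
    batch = []
    for c in centers:
        remaining = [i for i in idxs if i not in batch]
        batch.append(min(remaining, key=lambda i: sq_dist(X[i], c)))
    return batch


def coverage_cost(X, batch):
    """ASSUMPTION: stand-in for the paper's expected predictive variance."""
    return sum(min(sq_dist(x, X[b]) for b in batch) for x in X)


def select_batch(X, scores, k, set_sizes, seed=0):
    """Return the candidate batch with the lowest stand-in cost.

    scores: informativeness per sample, assumed to come from some
            sequential active learner (higher means more informative).
    set_sizes: ASSUMPTION: each informative set is the top-m ranked samples.
    """
    ranked = sorted(range(len(X)), key=lambda i: scores[i], reverse=True)
    best, best_cost = None, float("inf")
    for m in set_sizes:
        batch = candidate_batch(X, ranked[:m], k, seed=seed)
        cost = coverage_cost(X, batch)
        if cost < best_cost:
            best, best_cost = batch, cost
    return best


# Toy usage: three tight pairs of points with made-up informativeness scores.
X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.1, 0.0)]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
batch = select_batch(X, scores, k=2, set_sizes=[2, 4, 6])
```

In this toy setup the smallest informative set contains only the two top-ranked, nearly identical points, so the larger sets give the clustering step room to trade some informativeness for spread. In practice a library k-means and a model-based variance estimate would likely replace the two helper functions.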

Keywords