IEEE Access (Jan 2018)

Dual Learning-Based Safe Semi-Supervised Learning

  • Haitao Gan,
  • Zhenhua Li,
  • Yingle Fan,
  • Zhizeng Luo

DOI
https://doi.org/10.1109/ACCESS.2017.2784406
Journal volume & issue
Vol. 6
pp. 2615 – 2621

Abstract

In many real-world applications, labeled instances are scarce and expensive to collect, while unlabeled instances are plentiful. Semi-supervised learning (SSL) has therefore attracted much attention as an effective tool for exploiting unlabeled instances. However, how to use the unlabeled instances safely is an emerging and interesting problem in SSL. We propose DuAL Learning-based sAfe Semi-supervised learning (DALLAS), which employs dual learning to estimate the safety or risk of the unlabeled instances. To exploit the unlabeled instances safely, our basic idea is to use supervised learning (SL) to analyze their risk. First, DALLAS uses a primal model obtained by dual learning to classify each unlabeled instance, and then uses a dual model to reconstruct the unlabeled instances from the resulting classifications. The risk is measured by analyzing the reconstruction error and the predictions for the original and reconstructed unlabeled instances. If the error is small and the predictions agree, the unlabeled instance may be safe. Otherwise, the instance may be risky and its output should approach that obtained by SL. Finally, we embed a risk-based regularization term into SSL, so that the outputs of our algorithm are a tradeoff between those of SL and SSL. In particular, we use regularized least squares (RLS) for SL and Laplacian RLS for SSL. To verify the effectiveness of the proposed safe mechanism in DALLAS, we carry out a series of experiments on several data sets, comparing against state-of-the-art supervised, semi-supervised, and safe semi-supervised learning methods. The results demonstrate that DALLAS can effectively reduce the risk of the unlabeled instances.
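
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not give the exact formulas. The sketch assumes linear models in place of whatever kernel form the paper uses, takes the RLS classifier as the primal model, uses a class-centroid lookup as a hypothetical stand-in for the dual (reconstruction) model, and uses an illustrative risk formula. All function names and the parameters `lam`, `gamma`, `beta`, `k`, and `scale` are assumptions introduced here.

```python
import numpy as np

def rls_fit(X, y, lam=1e-2):
    """Linear regularized least squares: min ||Xw - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized kNN graph.
    Assumes more than k points and no duplicate points."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    nbrs = np.argsort(d2, axis=1)[:, 1:k + 1]  # position 0 is the point itself
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)
    return np.diag(W.sum(axis=1)) - W

def predict_label(X, w):
    """Binary labels in {-1, +1} from a linear score."""
    return np.where(X @ w >= 0, 1, -1)

def risk_scores(X_u, w_primal, centroids, scale=1.0):
    """Hypothetical risk in [0, 1] per unlabeled instance.
    The 'dual model' here is only a stand-in: it reconstructs an instance
    as the centroid of its predicted class."""
    pred = predict_label(X_u, w_primal)
    X_rec = np.where(pred[:, None] > 0, centroids[1], centroids[-1])
    err = np.linalg.norm(X_u - X_rec, axis=1)
    risk = 1.0 - np.exp(-err / scale)  # small reconstruction error gives low risk
    # Disagreement between original and reconstructed predictions is maximal risk.
    risk[predict_label(X_rec, w_primal) != pred] = 1.0
    return risk

def safe_ssl_fit(X_l, y_l, X_u, risk, f_sl_u, lam=1e-2, gamma=0.1, beta=1.0, k=5):
    """Linear Laplacian RLS plus a risk-weighted penalty that pulls the output
    on each unlabeled instance toward the supervised output f_sl_u."""
    X = np.vstack([X_l, X_u])
    L = knn_laplacian(X, k)
    d = X.shape[1]
    A = (X_l.T @ X_l + lam * np.eye(d)
         + gamma * X.T @ L @ X
         + beta * (X_u.T * risk) @ X_u)
    b = X_l.T @ y_l + beta * X_u.T @ (risk * f_sl_u)
    return np.linalg.solve(A, b)

# Toy usage on two synthetic Gaussian classes (synthetic data, not from the paper).
rng = np.random.default_rng(0)
X_l = np.vstack([rng.normal(-2, 1, (5, 2)), rng.normal(2, 1, (5, 2))])
y_l = np.array([-1.0] * 5 + [1.0] * 5)
X_u = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])

w_sl = rls_fit(X_l, y_l)                                       # SL model
centroids = {c: X_l[y_l == c].mean(axis=0) for c in (-1, 1)}
risk = risk_scores(X_u, w_sl, centroids)
w_safe = safe_ssl_fit(X_l, y_l, X_u, risk, X_u @ w_sl)         # safe SSL model
```

In this sketch, `beta` controls the tradeoff mentioned in the abstract: as `beta` grows, the outputs on high-risk unlabeled instances are pulled toward the supervised outputs, while instances with risk near zero are governed mainly by the Laplacian RLS terms.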

Keywords