IEEE Access (Jan 2019)

Semi-Supervised Fine-Grained Image Categorization Using Transfer Learning With Hierarchical Multi-Scale Adversarial Networks

  • Peng Chen,
  • Peng Li,
  • Qing Li,
  • Dezheng Zhang

DOI
https://doi.org/10.1109/ACCESS.2019.2934476
Journal volume & issue
Vol. 7
pp. 118650 – 118668

Abstract

Fine-grained image categorization remains a challenging computer vision problem. Most existing methods rely heavily on massive labeled data, which are scarce in many real-world applications. Progressive learning demands on existing data are also very common today: we may want more fine-grained information (such as arctic tern, black tern, buttercup, or tulip) from an existing data set whose labels are as coarse as “bird” and “flower”. It is reasonable to believe that the existing labels and a model with transferable knowledge would help with another related but different fine-grained recognition task. In this context, this paper proposes an improved deep transfer learning approach with hierarchical multi-adversarial networks. With this approach, cross-domain features are first extracted coarsely by advanced deep encoders. We then annotate a small number of images in the target domain, creating “active labels” that guide adversarial learning. Next, the GAN-based hierarchical model is used to select cross-domain categories and enhance related features so as to facilitate effective transfer. To exploit useful local features, we introduce a novel adaptive attention mechanism, the Region Adversarial Network (RAN), which selects attention regions during adversarial learning and generates valuable fine-grained features. We call the proposed hierarchical framework “Attentional Multi-Adversarial Networks (AMAN)”. Experimental results show that AMAN augments cross-domain features in a well-directed manner and builds an effective classifier for fine-grained categorization in the target domain, achieving higher accuracy with fewer training samples.

Keywords