IEEE Open Journal of the Computer Society (Jan 2023)

Reverse Self-Distillation: Overcoming the Self-Distillation Barrier

  • Shuiping Ni,
  • Xinliang Ma,
  • Mingfu Zhu,
  • Xingwang Li,
  • Yu-Dong Zhang

DOI
https://doi.org/10.1109/OJCS.2023.3288227
Journal volume & issue
Vol. 4
pp. 195 – 205

Abstract


With limited training data, deep neural networks generally cannot extract enough useful information for image classification, resulting in poor performance. Self-distillation, a novel knowledge distillation technique, integrates the roles of teacher and student into a single network to address this problem. A better understanding of why self-distillation is effective is critical to its advancement. In this article, we provide a new perspective: the effectiveness of self-distillation comes not only from distillation itself but also from the supervisory information provided by the shallow networks. At the same time, we identify a barrier that limits the effectiveness of self-distillation. Based on this, we propose reverse self-distillation, in which, in contrast to self-distillation, the internal knowledge flows in the opposite direction. Experimental results show that reverse self-distillation breaks the barrier of self-distillation and further improves network accuracy, with average gains of 2.8% on CIFAR100 and 3.2% on TinyImageNet.
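
For intuition, the sketch below contrasts the two knowledge-flow directions in a multi-exit network. It is a minimal illustration under our own assumptions, not the authors' exact training recipe: the distillation_losses function, the list of per-exit logits, the temperature T, and the equal loss weighting are all hypothetical, and any feature-level distillation terms used in full self-distillation frameworks are omitted.

# Minimal sketch of the knowledge-flow direction, NOT the paper's exact
# method. Assumes a backbone exposing logits from several exits:
# logits[0] is the shallowest classifier, logits[-1] the deepest.
import torch
import torch.nn.functional as F

def distillation_losses(logits, labels, T=3.0, reverse=False):
    """Cross-entropy at every exit plus KL terms between exits.

    reverse=False: standard self-distillation, the deepest exit teaches
                   the shallow ones (knowledge flows deep -> shallow).
    reverse=True:  reverse self-distillation, the shallow exits teach
                   the deepest one (knowledge flows shallow -> deep).
    """
    # Hard-label supervision at every exit (the shallow classifiers
    # also provide supervisory information, per the abstract).
    ce = sum(F.cross_entropy(l, labels) for l in logits)

    kd = torch.zeros((), device=logits[0].device)
    deep = logits[-1]
    for shallow in logits[:-1]:
        if reverse:
            student, teacher = deep, shallow.detach()   # shallow -> deep
        else:
            student, teacher = shallow, deep.detach()   # deep -> shallow
        # Temperature-scaled KL divergence, scaled by T^2 as in
        # standard knowledge distillation.
        kd = kd + F.kl_div(
            F.log_softmax(student / T, dim=1),
            F.softmax(teacher / T, dim=1),
            reduction="batchmean",
        ) * (T * T)
    return ce + kd

# Toy usage with random "exit" logits for a 100-class problem
# (CIFAR100-like); real use would take logits from auxiliary exits.
if __name__ == "__main__":
    logits = [torch.randn(8, 100, requires_grad=True) for _ in range(4)]
    labels = torch.randint(0, 100, (8,))
    loss = distillation_losses(logits, labels, reverse=True)
    loss.backward()
    print(loss.item())

The only change between the two regimes is which side of each KL term is detached: detaching the teacher stops gradients from flowing into it, so the direction of the detach determines the direction of the internal knowledge flow.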

Keywords