ETRI Journal (Jan 2019)

Layer‐wise hint‐based training for knowledge transfer in a teacher‐student framework

  • Ji‐Hoon Bae,
  • Junho Yim,
  • Nae‐Soo Kim,
  • Cheol‐Sig Pyo,
  • Junmo Kim

DOI
https://doi.org/10.4218/etrij.2018-0152
Journal volume & issue
Vol. 41, no. 2
pp. 242 – 253

Abstract


We devise a layer-wise hint training method that improves the existing hint-based knowledge distillation (KD) approach used for knowledge transfer in a teacher-student framework with residual networks (ResNets). The proposed method first trains the student ResNet iteratively, incrementally incorporating hint-based information extracted from the pretrained teacher ResNet through several hint and guided layers. Next, conventional softening-factor-based KD training is performed using the previously estimated hint-based information. We compare the recognition accuracy of the proposed approach with that of KD training without hints, hint-based KD training, and ResNet-based layer-wise pretraining on standard benchmark datasets: CIFAR-10, CIFAR-100, and MNIST. By selecting multiple hint-based information items and transferring them layer-wise, the proposed method enables the trained student ResNet to reflect the pretrained teacher ResNet's rich information more accurately than the baseline training methods on all the benchmark datasets considered in this study.
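The two training stages described in the abstract can be summarized by two loss terms: an L2 "hint" loss between an intermediate teacher (hint) layer and an intermediate student (guided) layer, followed by temperature-softened KD on the output logits. The following is a minimal PyTorch sketch under those assumptions; the regressor, temperature T, and weight lam are illustrative placeholders, not the paper's exact architecture or hyperparameters.

import torch.nn.functional as F

def hint_loss(student_feat, teacher_feat, regressor):
    """Stage 1 (layer-wise hint training): L2 loss between the guided (student)
    feature, mapped through a regressor, and the hint (teacher) feature."""
    return F.mse_loss(regressor(student_feat), teacher_feat)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, lam=0.5):
    """Stage 2 (softening-factor KD): KL divergence between temperature-softened
    teacher and student outputs, combined with the usual cross-entropy loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return lam * soft + (1.0 - lam) * hard

In the layer-wise scheme, the hint loss would be applied repeatedly, one hint/guided layer pair at a time, before the KD stage; the sketch above only fixes the form of the two objectives.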

Keywords