IEEE Access (Jan 2024)

Decoupled Time-Dimensional Progressive Self-Distillation With Knowledge Calibration for Edge Computing-Enabled AIoT

  • Yingchao Wang
  • Wenqi Niu
  • Hanpo Hou

DOI
https://doi.org/10.1109/ACCESS.2024.3512789
Journal volume & issue
Vol. 12
pp. 184883 – 184895

Abstract


Time-dimensional self-distillation seeks to transfer knowledge from earlier historical models to subsequent ones with minimal computational overhead. This enables model self-augmentation without the need for large teacher models, making it particularly suitable for resource-constrained edge and fog computing environments within the Artificial Intelligence of Things (AIoT). However, the historical model's confidence is often insufficient, and its output lies at a higher semantic level, so learning directly from that output is inefficient. This study therefore proposes a novel time-dimensional Decoupled Progressive Self-Distillation (DPSD) method that incorporates knowledge calibration and decoupling. DPSD calibrates the Target Class Knowledge (TCK) of the historical output by constructing a smoother fusion label that integrates the ground truth. This fusion label is then decoupled into TCK and Non-target Class Knowledge (NCK), and the NCK is calibrated by introducing Zipf's label. By flexibly learning TCK and NCK, the historical model's knowledge is transferred effectively to subsequent student models. Extensive experiments with four neural networks on the CIFAR-100 and Tiny-ImageNet datasets show that DPSD achieves average Top-1 accuracy improvements of 1.74% and 1.81%, respectively, with maximum gains of 2.19% and 2.42%, surpassing seven other state-of-the-art self-distillation methods.
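The abstract outlines the core mechanism: fuse the historical model's output with the ground truth to calibrate target-class knowledge, then decouple the fused label into TCK and NCK and calibrate the NCK with a Zipf-style ranked label. Below is a minimal, hypothetical PyTorch sketch of such a decoupled loss; the function name `dpsd_loss`, the weighting scheme, and the exact Zipf construction are illustrative assumptions, not the paper's published formulation.

```python
import torch
import torch.nn.functional as F


def dpsd_loss(student_logits, historical_logits, targets,
              alpha=0.5, tau=4.0, w_tck=1.0, w_nck=1.0):
    """Illustrative decoupled self-distillation loss (not the paper's exact form).

    TCK: the target-class probability, calibrated by fusing the historical
    model's softened output with the one-hot ground truth (weight alpha).
    NCK: the non-target-class distribution, calibrated toward a Zipf-style
    label built from the historical model's ranking of non-target classes.
    """
    num_classes = student_logits.size(1)
    one_hot = F.one_hot(targets, num_classes).float()
    non_target_mask = 1.0 - one_hot

    # Softened distributions from the student and the frozen historical model.
    p_student = F.softmax(student_logits / tau, dim=1)
    p_hist = F.softmax(historical_logits.detach() / tau, dim=1)

    # Fusion label: a smoother mix of the historical output and the ground truth.
    fusion = alpha * one_hot + (1.0 - alpha) * p_hist

    # TCK loss: match the student's target-class confidence to the fusion label.
    p_t_student = (p_student * one_hot).sum(dim=1).clamp(1e-6, 1.0 - 1e-6)
    p_t_fusion = (fusion * one_hot).sum(dim=1)
    tck_loss = F.binary_cross_entropy(p_t_student, p_t_fusion)

    # NCK loss: match the student's non-target distribution to a Zipf label.
    # Rank non-target classes by the historical model's probabilities, then
    # assign probability proportional to 1/rank (a Zipf-like prior).
    hist_nt = p_hist * non_target_mask
    ranks = hist_nt.argsort(dim=1, descending=True).argsort(dim=1).float() + 1.0
    zipf = (1.0 / ranks) * non_target_mask
    zipf = zipf / zipf.sum(dim=1, keepdim=True)

    student_nt = p_student * non_target_mask
    student_nt = student_nt / student_nt.sum(dim=1, keepdim=True).clamp_min(1e-6)
    log_ratio = zipf.clamp_min(1e-6).log() - student_nt.clamp_min(1e-6).log()
    nck_loss = (zipf * log_ratio).sum(dim=1).mean()  # KL(zipf || student)

    ce_loss = F.cross_entropy(student_logits, targets)
    return ce_loss + w_tck * tck_loss + w_nck * nck_loss
```

In a progressive training loop, `historical_logits` would come from a snapshot of the same network saved at an earlier stage of training, so no separate teacher model needs to be stored or executed on the edge device.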
