Sensors (Mar 2024)

Enhancing Few-Shot Learning in Lightweight Models via Dual-Faceted Knowledge Distillation

  • Bojun Zhou,
  • Tianyu Cheng,
  • Jiahao Zhao,
  • Chunkai Yan,
  • Ling Jiang,
  • Xinsong Zhang,
  • Juping Gu

DOI
https://doi.org/10.3390/s24061815
Journal volume & issue
Vol. 24, no. 6
p. 1815

Abstract

In recent computer vision research, the pursuit of improved classification performance has often led to the adoption of complex, large-scale models. However, deploying such large models is difficult in environments constrained by limited computing power and storage capacity. This study therefore addresses these challenges by enhancing the classification performance of lightweight models. We propose a novel method that compresses the knowledge learned by a large model into a lightweight one, so that the latter can also achieve strong performance on few-shot classification tasks. Specifically, we propose a dual-faceted knowledge distillation strategy that combines output-based and intermediate feature-based methods. The output-based method distills knowledge related to base-class labels, while the intermediate feature-based approach, augmented by feature error distribution calibration, accounts for the potentially non-Gaussian nature of feature deviations, thereby improving the effectiveness of knowledge transfer. Experiments on the MiniImageNet, CIFAR-FS, and CUB datasets demonstrate that our method outperforms state-of-the-art lightweight models, particularly on five-way one-shot and five-way five-shot tasks.
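The dual-faceted strategy described in the abstract can be sketched as a weighted sum of two terms: an output-based term (KL divergence between temperature-softened teacher and student predictions) and an intermediate feature-based term (here simplified to a mean squared feature error, without the paper's error distribution calibration). This is a minimal illustrative sketch; the function names, the weight `alpha`, and the `temperature` value are assumptions for exposition, not the paper's reported formulation or hyperparameters.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_output_loss(teacher_logits, student_logits, temperature=4.0):
    """Output-based distillation: KL(teacher || student) on softened predictions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def kd_feature_loss(teacher_feat, student_feat):
    """Feature-based distillation, simplified to a mean squared feature error.

    The paper additionally calibrates the feature error distribution
    (it may be non-Gaussian); that step is omitted in this sketch.
    """
    n = len(teacher_feat)
    return sum((t - s) ** 2 for t, s in zip(teacher_feat, student_feat)) / n

def dual_faceted_loss(t_logits, s_logits, t_feat, s_feat,
                      alpha=0.5, temperature=4.0):
    """Combine the two facets; alpha and temperature are placeholder values."""
    return (alpha * kd_output_loss(t_logits, s_logits, temperature)
            + (1 - alpha) * kd_feature_loss(t_feat, s_feat))
```

When the student matches the teacher exactly, both terms vanish, so the combined loss is zero; in training, minimizing it pulls the lightweight student toward both the teacher's soft base-class predictions and its intermediate representations.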

Keywords