Decoupled knowledge distillation method based on meta-learning

Wenqing Du; Liting Geng; Jianxiong Liu; Zhigang Zhao; Chunxiao Wang; Jidong Huo

High-Confidence Computing (Mar 2024)

Affiliations

Wenqing Du: Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China; Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China
Liting Geng: Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China; Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China
Jianxiong Liu: Aerospace Science & Industry Network Information Development Co., LTD, Beijing 100854, China
Zhigang Zhao (corresponding author): Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China; Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China
Chunxiao Wang (corresponding author): Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China; Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China
Jidong Huo: Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China; Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China

Journal volume & issue: Vol. 4, no. 1, p. 100164

Abstract

As deep learning techniques advance, the number of model parameters keeps growing, which leads to significant memory consumption and limits the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose Decoupled MetaDistil, a decoupled meta-distillation method. The method uses meta-learning to guide the teacher model and dynamically adjusts the knowledge transfer strategy based on feedback from the student model, thereby improving the student's generalization ability. Furthermore, we introduce a decoupled loss that explicitly transfers positive-sample knowledge and explores the potential of negative-sample knowledge. Extensive experiments demonstrate the effectiveness of our method.
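
The abstract does not spell out the decoupled loss, so the following is only a minimal sketch of one common way to decouple a distillation loss, not the authors' implementation. It assumes that "positive sample knowledge" corresponds to the ground-truth (target) class and "negative sample knowledge" to the remaining classes. The function names, the weights `alpha` and `beta`, and the `temperature` parameter are hypothetical, and the meta-learning update of the teacher is omitted entirely.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def decoupled_kd_loss(student_logits, teacher_logits, target,
                      alpha=1.0, beta=1.0, temperature=1.0):
    """Hypothetical decoupled distillation loss (assumed form, see lead-in).

    Returns (total, target_term, non_target_term).
    """
    p_s = softmax(student_logits, temperature)
    p_t = softmax(teacher_logits, temperature)

    # Target ("positive") term: binary distribution [target, everything else].
    b_s = [p_s[target], 1.0 - p_s[target]]
    b_t = [p_t[target], 1.0 - p_t[target]]
    target_term = kl_divergence(b_t, b_s)

    # Non-target ("negative") term: distribution renormalized over the
    # non-target classes only, so it is independent of target confidence.
    s_rest = [z for i, z in enumerate(student_logits) if i != target]
    t_rest = [z for i, z in enumerate(teacher_logits) if i != target]
    non_target_term = kl_divergence(softmax(t_rest, temperature),
                                    softmax(s_rest, temperature))

    total = alpha * target_term + beta * non_target_term
    return total, target_term, non_target_term

if __name__ == "__main__":
    total, pos, neg = decoupled_kd_loss([0.2, 1.0, 0.1], [0.5, 3.0, -1.0],
                                        target=1, alpha=1.0, beta=2.0)
    print(f"target term={pos:.4f} non-target term={neg:.4f} total={total:.4f}")
```

Weighting the two terms separately lets the non-target term contribute even when the teacher is very confident in the target class, which is the usual motivation for decoupling. How the paper actually sets or adapts these weights is not stated in the abstract.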

Published in High-Confidence Computing

ISSN: 2667-2952 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.journals.elsevier.com/high-confidence-computing
