IEEE Access (Jan 2018)

Beyond Knowledge Distillation: Collaborative Learning for Bidirectional Model Assistance

  • Jinzhuo Wang,
  • Wenmin Wang,
  • Wen Gao

DOI
https://doi.org/10.1109/ACCESS.2018.2854918
Journal volume & issue
Vol. 6
pp. 39490–39500

Abstract


Knowledge distillation (KD) is a powerful technique that enables a well-trained large model to assist a small model. However, KD is constrained to a teacher-student setting, so it may not be appropriate in general situations where the learning abilities of the two models are uncertain or not significantly different. In this paper, we propose a collaborative learning (CL) method, a flexible strategy that achieves bidirectional model assistance between two models through a mutual knowledge base (MKB). The MKB collects mutual information and provides assistance; it is updated along with the learning process of the two models and deployed separately once converged. We show that CL can be applied to any two deep neural networks and is easily extended to multiple networks. Compared with the teacher-student framework, CL achieves bidirectional assistance and imposes no specific requirements on the involved models, such as pretraining or a difference in capacity. The experimental results demonstrate that CL efficiently improves the learning ability and convergence speed of both models, outperforming a range of related methods, including ensemble learning and several KD-based approaches. More importantly, we show that state-of-the-art models, such as DenseNet, can be greatly improved using CL together with other popular models.
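The sketch below illustrates the general idea described in the abstract: two peer networks trained jointly with a shared mutual knowledge base that aggregates their outputs and feeds assistance back to each. It is a minimal conceptual sketch only; the paper's exact MKB architecture, loss terms, and update schedule are not specified in the abstract, so the fused-logit design, the KL-based assistance term, and names such as `mkb_weight` are illustrative assumptions rather than the authors' method.

```python
# Conceptual sketch of collaborative learning (CL) with a shared "mutual knowledge base" (MKB).
# Assumptions (not taken from the paper): the MKB is a small shared module that fuses both
# peers' logits and supplies an auxiliary alignment target to each peer.
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_peer(num_classes=10):
    # Stand-in peer network; in practice the peers can be any two CNNs (e.g. DenseNet plus another model).
    return nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 256), nn.ReLU(), nn.Linear(256, num_classes))


class MutualKnowledgeBase(nn.Module):
    # Shared module updated jointly with both peers; it fuses their predictions
    # into a common target that each peer is encouraged to agree with.
    def __init__(self, num_classes=10, hidden=64):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(2 * num_classes, hidden), nn.ReLU(), nn.Linear(hidden, num_classes))

    def forward(self, logits_a, logits_b):
        return self.fuse(torch.cat([logits_a, logits_b], dim=1))


def collaborative_step(x, y, peer_a, peer_b, mkb, opt, mkb_weight=0.5):
    # One joint update: supervised loss for each peer and the MKB, plus an auxiliary
    # term pulling each peer toward the MKB's fused prediction (bidirectional assistance).
    logits_a, logits_b = peer_a(x), peer_b(x)
    fused = mkb(logits_a, logits_b)
    ce = F.cross_entropy(logits_a, y) + F.cross_entropy(logits_b, y) + F.cross_entropy(fused, y)
    target = F.softmax(fused, dim=1).detach()  # treat the MKB output as a fixed assistance target
    assist = F.kl_div(F.log_softmax(logits_a, dim=1), target, reduction="batchmean") \
           + F.kl_div(F.log_softmax(logits_b, dim=1), target, reduction="batchmean")
    loss = ce + mkb_weight * assist
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


if __name__ == "__main__":
    peer_a, peer_b, mkb = make_peer(), make_peer(), MutualKnowledgeBase()
    opt = torch.optim.SGD(list(peer_a.parameters()) + list(peer_b.parameters()) + list(mkb.parameters()), lr=0.01)
    x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))  # toy batch
    print(collaborative_step(x, y, peer_a, peer_b, mkb, opt))
```

In this reading, the abstract's note that the MKB is "deployed separately when converged" would correspond to keeping the trained `MutualKnowledgeBase` as a standalone module after training, while the two peers remain independent networks that benefited from its assistance.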

Keywords