IEEE Access (Jan 2022)

NeuRes: Highly Activated Neurons Responses Transfer via Distilling Sparse Activation Maps

  • Sharmen Akhter,
  • Md Imtiaz Hossain,
  • Md Delowar Hossain,
  • Eui-Nam Huh

DOI: https://doi.org/10.1109/ACCESS.2022.3227804
Journal volume & issue: Vol. 10, pp. 131555–131566

Abstract

In recent years, Knowledge Distillation has attracted significant interest for mobile, edge, and IoT devices because of its ability to transfer knowledge from a large, complex teacher to a lightweight student network. Intuitively, Knowledge Distillation forces the student to mimic the teacher’s neuron responses, deploying distillation losses as regularization terms to improve the student’s generalization. However, the non-linearity of the hidden layers and the high dimensionality of the feature maps make knowledge transfer difficult. Although numerous methods transfer the teacher’s neuron responses in the form of diverse feature characteristics, such as attention and contrastive representations, to the best of our knowledge no prior work has considered feature-level non-linearity during distillation. In this work, we ask: can feature-level non-linearity-based approaches improve student performance? To investigate this question, we propose a novel knowledge distillation technique called NeuRes (Neuron Responses), which distills Sparse Activation Maps (SAMs) to transfer the most highly activated neuron responses to the student and thereby enhance its representation capability. The proposed NeuRes selects the highly activated neuron responses that form the SAMs and transfers this knowledge using activation normalization. NeuRes also transfers translation-invariant features through auxiliary classifiers and augmented data to further improve the student’s generalization. Detailed ablation studies and extensive experiments on model compression, transferability, adversarial robustness, and few-shot learning verify that NeuRes outperforms state-of-the-art distillation techniques on standard benchmark datasets.
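The abstract does not give the exact formulation of the SAM distillation loss. The sketch below is a minimal PyTorch-style illustration of the general idea it describes: keep only the most highly activated responses per sample, normalize the resulting sparse map, and penalize the student for deviating from the teacher’s map. All names (sparse_activation_map, neures_loss, total_loss) and hyperparameters (top_k_ratio, alpha, beta, T) are illustrative assumptions, not the authors’ implementation.

    import torch
    import torch.nn.functional as F

    def sparse_activation_map(feat: torch.Tensor, top_k_ratio: float = 0.25) -> torch.Tensor:
        """Keep only the most activated spatial responses, then L2-normalize.

        feat: (B, C, H, W) hidden-layer feature map.
        """
        b = feat.size(0)
        # Aggregate channels into a single spatial activation map.
        amap = feat.abs().sum(dim=1).view(b, -1)            # (B, H*W)
        k = max(1, int(top_k_ratio * amap.size(1)))
        # Threshold at the k-th largest activation per sample.
        thresh = amap.topk(k, dim=1).values[:, -1:]         # (B, 1)
        sparse = torch.where(amap >= thresh, amap, torch.zeros_like(amap))
        # Activation normalization: make teacher/student maps scale-comparable.
        return F.normalize(sparse, p=2, dim=1)

    def neures_loss(s_feat, t_feat, top_k_ratio=0.25):
        # Match the student's sparse activation map to the teacher's.
        # Assumes matching spatial sizes (e.g., aligned via adaptive pooling).
        return F.mse_loss(sparse_activation_map(s_feat, top_k_ratio),
                          sparse_activation_map(t_feat, top_k_ratio))

    def total_loss(logits_s, logits_t, labels, s_feat, t_feat,
                   alpha=0.9, beta=100.0, T=4.0):
        # Overall objective sketch: cross-entropy + logit KD + SAM matching term.
        ce = F.cross_entropy(logits_s, labels)
        kd = F.kl_div(F.log_softmax(logits_s / T, dim=1),
                      F.softmax(logits_t / T, dim=1),
                      reduction="batchmean") * (T * T)
        return ce + alpha * kd + beta * neures_loss(s_feat, t_feat)

In this reading, the top-k thresholding enforces the sparsity that gives SAMs their name, so only the most informative responses are distilled, while the L2 normalization removes the magnitude mismatch between teacher and student activations before they are compared.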
