Journal of King Saud University: Computer and Information Sciences (Jul 2023)

Compressing medical deep neural network models for edge devices using knowledge distillation

  • F. MohiEldeen Alabbasy,
  • A.S. Abohamama,
  • Mohammed F. Alrahmawy

Journal volume & issue
Vol. 35, no. 7
p. 101616

Abstract

Recently, deep neural networks (DNNs) have been used successfully in many fields, particularly in medical diagnosis. However, deep learning (DL) models are expensive in terms of memory and computing resources, which hinders their deployment on resource-limited devices and in delay-sensitive systems. Therefore, these deep models need to be accelerated and compressed to smaller sizes so that they can be deployed on edge devices without noticeably affecting their performance. In this paper, recent DNN acceleration and compression approaches are analyzed and compared regarding their performance, applications, benefits, and limitations, with a particular focus on knowledge distillation as a successful emerging approach in this field. In addition, a framework is proposed for developing knowledge-distilled DNN models that can be deployed on fog/edge devices for automatic disease diagnosis. To evaluate the proposed framework, two compressed medical diagnosis systems based on knowledge-distilled deep neural models are proposed, one for COVID-19 and one for malaria. The experimental results show that these knowledge-distilled models were compressed by 18.4% and 15% of the original model and their responses accelerated by 6.14x and 5.86x, respectively, with no significant drop in performance (0.9% and 1.2%, respectively). Furthermore, the distilled models are compared with pruned and quantized models. The results show the superiority of the distilled models in terms of compression rate and response time.

Keywords