IEEE Access (Jan 2020)
Model-Based Deep Encoding Based on USB Transmission for Modern Edge Computing Architectures
Abstract
With the advance of deep neural networks (DNNs), artificial intelligence (AI) has been widely applied in our daily lives. DNN-based models can be stored on portable storage disks or low-power Neural Compute Sticks and then deployed to edge devices through the USB interface for AI-based applications such as Automatic Diagnosis Systems or Smart Surveillance Systems, which offers a way to incorporate AI into the Internet of Things (IoT). In this work, based on our observation and careful analysis, we propose a model-based deep encoding method built upon Huffman coding to compress a DNN model transmitted through the USB interface to edge devices. Using the proposed lopsidedness estimation approach, we apply a modified Huffman coding method that increases the USB transmission efficiency for quantized DNN models while reducing the computational cost of the coding process. We conducted experiments on several benchmark DNN models compressed using three emerging quantization techniques; the results indicate that our method achieves a high compression ratio of 88.72%, with 93.76% of the stuffing bits saved on average.
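To make the two quantities in the abstract concrete, the following is a minimal sketch of textbook Huffman coding applied to quantized weight values, together with a counter for USB stuffing bits. It is not the modified coding method or the lopsidedness estimation proposed in this work, whose details are not given here. The function names `huffman_code` and `usb_stuffing_bits` and the toy weight list are illustrative assumptions; the stuffing rule (a 0 inserted after every six consecutive 1s) is the standard USB bit-stuffing rule.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(symbols):
    """Return {symbol: bitstring} for a plain (unmodified) Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)  # least frequent subtree
        f1, _, c1 = heapq.heappop(heap)  # second least frequent subtree
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]


def usb_stuffing_bits(bits):
    """Count the 0 bits USB would insert: one after every six consecutive 1s."""
    run = stuffed = 0
    for b in bits:
        if b == "1":
            run += 1
            if run == 6:
                stuffed += 1
                run = 0  # the stuffed 0 breaks the run of 1s
        else:
            run = 0
    return stuffed


# Hypothetical 2-bit quantized weights with a skewed value distribution.
weights = [0] * 8 + [1] * 4 + [-1] * 3 + [2] * 1
code = huffman_code(weights)
encoded = "".join(code[w] for w in weights)
print(len(encoded), "bits after Huffman coding vs", 2 * len(weights), "fixed-width")
print("stuffing bits USB would add:", usb_stuffing_bits(encoded))
```

Because frequent weight values receive short codewords, a skewed weight distribution shrinks the payload; the bit pattern of the chosen codewords also determines how many runs of six 1s occur, which is why the choice of code affects the number of stuffing bits on the USB link.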
Keywords