Heuristic Compression Method for CNN Model Applying Quantization to a Combination of Structured and Unstructured Pruning Techniques

Danhe Tian; Shinichi Yamagiwa; Koichi Wada

doi:10.1109/ACCESS.2024.3399541

IEEE Access (Jan 2024)

Affiliations

Danhe Tian: Doctoral Program in Computer Science, University of Tsukuba, Tsukuba, Ibaraki, Japan
Shinichi Yamagiwa: Faculty of Engineering, Information and Systems, University of Tsukuba, Tsukuba, Ibaraki, Japan
Koichi Wada: Faculty of Engineering, Information and Systems, University of Tsukuba, Tsukuba, Ibaraki, Japan

DOI: https://doi.org/10.1109/ACCESS.2024.3399541
Journal volume & issue: Vol. 12
pp. 66680 – 66689

Abstract

Model compression has been an actively pursued research field in recent years, with the goal of deploying state-of-the-art deep neural networks. It targets implementations on power-constrained and resource-limited devices, because the reduced model effectively shrinks resource usage without significant accuracy loss. Network pruning and weight quantization are well-known model compression methods. Our previous work demonstrated significant reductions in network model size by applying a managed combination of structured and unstructured pruning methods. To reduce the model further, this paper introduces new heuristic methods that apply a weight quantization technique together with both structured and unstructured pruning methods while keeping a given target accuracy. We experimentally evaluate the performance of the proposed method by applying it to the state-of-the-art CNN models VGGNet, ResNet, and DenseNet on the well-known CIFAR-10 dataset. In the best case in our experiments, the proposed method achieves a 28-fold reduction in model size and a 76-fold reduction in compression processing time compared to the brute-force search method.
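
For readers unfamiliar with the three building blocks named in the abstract, the following is a minimal sketch of each, written in plain Python on a toy weight matrix. It is not the authors' heuristic: the abstract does not state the pruning criteria or the quantization scheme, so the magnitude criterion, the L1-norm filter ranking, the symmetric uniform quantizer, and all function names here are illustrative assumptions.

```python
from typing import List

# Each row stands in for the flattened weights of one filter (toy assumption).
Matrix = List[List[float]]


def prune_structured(weights: Matrix, n_remove: int) -> Matrix:
    """Structured pruning: drop the n_remove rows (filters) with the smallest L1 norm."""
    order = sorted(range(len(weights)),
                   key=lambda i: sum(abs(w) for w in weights[i]))
    removed = set(order[:n_remove])
    return [row[:] for i, row in enumerate(weights) if i not in removed]


def prune_unstructured(weights: Matrix, sparsity: float) -> Matrix:
    """Unstructured pruning: zero the smallest-magnitude individual weights.

    Roughly a `sparsity` fraction is zeroed; ties at the threshold may zero more.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)  # number of weights to zero
    if k == 0:
        return [row[:] for row in weights]
    threshold = flat[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def quantize_uniform(weights: Matrix, bits: int) -> Matrix:
    """Symmetric uniform quantization onto a grid of 2**(bits-1)-1 positive levels.

    Assumes bits >= 2. Zeroed (pruned) weights stay exactly zero.
    """
    max_abs = max((abs(w) for row in weights for w in row), default=0.0)
    if max_abs == 0.0:
        return [row[:] for row in weights]
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels
    return [[round(w / scale) * scale for w in row] for row in weights]


# Toy example: three "filters" with three weights each.
toy = [[0.9, -0.05, 0.4],
       [0.01, 0.02, -0.03],
       [-0.7, 0.3, 0.08]]

compressed = quantize_uniform(
    prune_unstructured(prune_structured(toy, n_remove=1), sparsity=0.5),
    bits=4,
)
```

A method like the one described would additionally have to choose the pruning amounts and bit width so that the compressed model still meets the target accuracy. That search is the part the paper's heuristics address, and it is deliberately left out of this sketch.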

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
