IEEE Access (Jan 2022)
Target Capacity Filter Pruning Method for Optimized Inference Time Based on YOLOv5 in Embedded Systems
Abstract
Recently, convolutional neural networks (CNNs) have attracted considerable attention for their excellent performance in the field of computer vision. However, as networks grow wider for higher accuracy, the number of parameters and the computational cost increase substantially. This makes it challenging to deploy deep learning networks in embedded environments with limited resources, computational performance, and power, and inference on such devices consumes a great deal of time. To address this problem, we propose a practical filter pruning method that produces an optimal network architecture for a target capacity and accelerates inference. After establishing the correlation between inference time and FLOPs, we present a method to generate a network with a desired inference time. The performance of the proposed filter pruning method was evaluated on various object detection datasets, and the inference time of the pruned networks was measured and analyzed on the NVIDIA Jetson Xavier NX platform. When the parameters and FLOPs of the YOLOv5 network were pruned by 30%, 40%, and 50% on the PASCAL VOC dataset, the mAP decreased by 0.6%, 2.3%, and 2.9%, respectively, while the inference time improved by 14.3%, 26.4%, and 34.5%, respectively.
Keywords