IEEE Access (Jan 2025)

Knowledge Distillation in Object Detection for Resource-Constrained Edge Computing

  • Arief Setyanto,
  • Theopilus Bayu Sasongko,
  • Muhammad Ainul Fikri,
  • Dhani Ariatmanto,
  • I. Made Artha Agastya,
  • Rakandhiya Daanii Rachmanto,
  • Affan Ardana,
  • In Kee Kim

DOI
https://doi.org/10.1109/ACCESS.2025.3534020
Journal volume & issue
Vol. 13
pp. 18200 – 18214

Abstract

Edge computing, a distributed computing paradigm that places small yet capable computing devices near data sources and IoT sensors, is gaining widespread adoption in real-world applications such as real-time intelligent drones, autonomous vehicles, and robotics. Object detection (OD) is an essential task in computer vision. Although state-of-the-art deep learning-based OD methods achieve high detection rates, their large model size and high computational demands often hinder deployment on resource-constrained edge devices. Given their limited memory and computational power, edge devices such as the Jetson Nano (J. Nano), Jetson Orin Nano (Orin Nano), and Raspberry Pi 4B (Raspi4B) require model optimization and compression techniques to deploy large OD models such as YOLO. YOLOv4 is a widely used OD model consisting of a backbone for image feature extraction and a prediction layer. YOLOv4 was originally designed with CSPDarknet53 as its backbone, which requires significant computational power. In this paper, we propose replacing this backbone with a smaller model such as MobileNetV2 or RepViT. To ensure strong backbone performance, we perform knowledge distillation (KD), using CSPDarknet53 as the teacher and the smaller model as the student. We compare various KD algorithms to identify the technique that produces a smaller model with a modest accuracy drop. According to our experiments, Contrastive Representation Distillation (CRD) yields MobileNetV2 and RepViT backbones with an acceptable accuracy drop. Considering both accuracy drop and model size, we select either MobileNetV2 or RepViT to replace CSPDarknet53 in the modified YOLOv4 models, named M-YOLO-CRD and RV-YOLO-CRD, respectively. Our evaluation results demonstrate that RV-YOLO-CRD reduces model size by 30% and achieves a better mean average precision (mAP) than M-YOLO-CRD. Our experiments show that M-YOLO-CRD significantly reduces model size (from 245.5 MB to 35.76 MB) and inference time ($6\times$ faster on CPU, $4\times$ faster on J. Nano, and $2.5\times$ faster on Orin Nano). While precision decreased slightly (by less than 4%), the model still performs well on edge devices. M-YOLO-CRD achieved a per-frame latency of around 37 ms on Orin Nano, 168 ms on J. Nano, and 1310 ms on Raspi4B.
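
The core idea summarized in the abstract, distilling feature knowledge from a CSPDarknet53 teacher into a lightweight student backbone such as MobileNetV2 via a contrastive objective, can be illustrated with a minimal sketch. The snippet below assumes PyTorch and torchvision; the Projector module, the use of in-batch negatives (the full CRD objective also draws negatives from a memory bank), the stand-in teacher embeddings, and all hyperparameters are illustrative assumptions, not the authors' exact configuration.

    # Minimal CRD-style distillation sketch (PyTorch); names and values are illustrative.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision.models import mobilenet_v2

    class Projector(nn.Module):
        """Maps backbone feature maps to a shared, unit-norm embedding space."""
        def __init__(self, in_dim, embed_dim=128):
            super().__init__()
            self.fc = nn.Linear(in_dim, embed_dim)

        def forward(self, x):
            x = F.adaptive_avg_pool2d(x, 1).flatten(1)  # global-average-pool the feature maps
            return F.normalize(self.fc(x), dim=1)       # L2-normalized embeddings

    def crd_style_loss(student_emb, teacher_emb, temperature=0.07):
        """InfoNCE-style contrastive loss: each student embedding is pulled toward the
        teacher embedding of the same image and pushed away from other images in the batch."""
        logits = student_emb @ teacher_emb.t() / temperature        # (B, B) similarity matrix
        targets = torch.arange(logits.size(0), device=logits.device)
        return F.cross_entropy(logits, targets)

    # Candidate lightweight student backbone (MobileNetV2 feature extractor).
    student = mobilenet_v2(weights=None).features
    proj_s = Projector(in_dim=1280)  # MobileNetV2 final feature channel count

    # Dummy batch; in practice teacher_emb would come from a frozen CSPDarknet53
    # plus its own projector (replaced here by random unit vectors for the sketch).
    images = torch.randn(4, 3, 224, 224)
    with torch.no_grad():
        teacher_emb = F.normalize(torch.randn(4, 128), dim=1)

    student_emb = proj_s(student(images))
    loss = crd_style_loss(student_emb, teacher_emb)
    loss.backward()  # gradients flow only into the student and its projector

In this setup the teacher is kept frozen, so only the student backbone and projector are updated; after distillation the student would replace CSPDarknet53 inside the YOLOv4 detection pipeline described in the abstract.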

Keywords