A Super-Resolution-Based Feature Map Compression for Machine-Oriented Video Coding

Jung-Heum Kang; Muhammad Salman Ali; Hye-Won Jeong; Chang-Kyun Choi; Younhee Kim; Se Yoon Jeong; Sung-Ho Bae; Hui Yong Kim

doi:10.1109/ACCESS.2023.3260223

IEEE Access (Jan 2023)

Affiliations

Jung-Heum Kang: Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea
Muhammad Salman Ali: Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea
Hye-Won Jeong: Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea
Chang-Kyun Choi: Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea
Younhee Kim: Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea
Se Yoon Jeong: Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea
Sung-Ho Bae: Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea
Hui Yong Kim: Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea

DOI: https://doi.org/10.1109/ACCESS.2023.3260223
Journal volume & issue: Vol. 11
pp. 34198 – 34209

Abstract

Recently, video and image compression methods based on neural networks have received much attention. In MPEG standardization, Video Coding for Machines (VCM) is an emerging topic that aims to compress features or images for machine vision tasks. In particular, compressing features has advantages in terms of privacy protection and computation offloading. In this paper, we propose an effective feature compression method equipped with a super-resolution (SR) module for features. Our main motivation comes from the observation that features are somewhat robust to spatial distortions (e.g., additive white Gaussian noise (AWGN), blur, quantization distortions, coding artifacts), which leads us to integrate an SR module into the compression framework. We also explore the best training strategy for the proposed method, i.e., finding the best combination of losses and the proper input feature shapes. Our comprehensive experiments show that the proposed method outperforms the baseline in the original VCM anchor scenario across various QP values with Versatile Video Coding (VVC). Specifically, the proposed framework achieves up to a 50% BD-rate reduction compared to the conventional P-layer feature map compression method for the object detection task on the OpenImage dataset.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
