Adaptive spatial down-sampling method based on object occupancy distribution for video coding for machines

Eun-bin An; Ayoung Kim; Soon-heung Jung; Sangwoon Kwak; Jin Young Lee; Won-Sik Cheong; Hyon-Gon Choo; Kwang-deok Seo

doi:10.1186/s13640-024-00647-y

EURASIP Journal on Image and Video Processing (Oct 2024)

Affiliations

Eun-bin An: Division of Software, Yonsei University
Ayoung Kim: Division of Software, Yonsei University
Soon-heung Jung: Media Research Division, Electronics and Telecommunications Research Institute
Sangwoon Kwak: Media Research Division, Electronics and Telecommunications Research Institute
Jin Young Lee: Media Research Division, Electronics and Telecommunications Research Institute
Won-Sik Cheong: Media Research Division, Electronics and Telecommunications Research Institute
Hyon-Gon Choo: Media Research Division, Electronics and Telecommunications Research Institute
Kwang-deok Seo: Division of Software, Yonsei University

Journal volume & issue: Vol. 2024, no. 1, pp. 1–17

Abstract

As the performance of machine vision continues to improve, it is being used in various industrial fields to analyze and generate massive amounts of video data. Although the demand for and consumption of video data by machines have increased significantly, video coding for machines still needs improvement. It is therefore necessary to consider a new codec that differs from conventional codecs based on the human visual system (HVS). Spatial down-sampling plays a critical role in video coding for machines because it reduces the volume of video data to be processed while preserving the shape of the features that the machine relies on when processing the video. However, research on effective methods of determining the intensity of spatial down-sampling as a coding tool for machines is still in its early stages. Here, we propose a method of determining an optimal scale factor for spatial down-sampling by collecting and analyzing information on the number of objects and the ratio of the area occupied by objects within a picture. To evaluate the performance of the proposed method, we compare the data reduction ratio to the machine accuracy error ratio (DRAER). With the proposed method, the DRAER ranged from a minimum of 11.94 dB to a maximum of 21.40 dB. This shows that the proposed method can achieve video coding gain for machines while maintaining the accuracy of machine vision tasks.
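
The two per-picture statistics named in the abstract, the number of objects and the ratio of the picture area they occupy, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes objects arrive as axis-aligned bounding boxes from some detector, and the function name, box format, and union-of-boxes definition of occupancy are all hypothetical choices made for illustration. How the paper maps these statistics to a scale factor is not stated in the abstract, so that step is left out.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels,
# with x_max and y_max exclusive.
Box = Tuple[int, int, int, int]


def occupancy_stats(boxes: List[Box], width: int, height: int) -> Tuple[int, float]:
    """Return (object count, fraction of the picture covered by any box).

    Assumption: occupancy is the union of all boxes, so overlapping
    regions are counted once.
    """
    covered = [[False] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the picture boundaries.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        for y in range(y0, y1):
            row = covered[y]
            for x in range(x0, x1):
                row[x] = True
    occupied = sum(sum(row) for row in covered)
    return len(boxes), occupied / (width * height)


# Two overlapping 50x50 boxes in a 100x100 picture:
# union area = 2500 + 2500 - 625 = 4375 pixels.
count, ratio = occupancy_stats([(0, 0, 50, 50), (25, 25, 75, 75)], 100, 100)
print(count, ratio)  # 2 0.4375
```

The pixel mask keeps the sketch short and makes overlap handling obvious. For large pictures, a sweep-line or array-based union would be the more practical choice.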

Published in EURASIP Journal on Image and Video Processing

ISSN: 1687-5176 (Print); 1687-5281 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
Website: https://jivp-eurasipjournals.springeropen.com
