Short video preloading via domain knowledge assisted deep reinforcement learning

Yuhong Xie; Yuan Zhang; Tao Lin; Zipeng Pan; Si-Ze Qian; Bo Jiang; Jinyao Yan

Digital Communications and Networks (Dec 2024)

Affiliations

Yuhong Xie: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
Yuan Zhang: State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
Tao Lin: State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China; Corresponding author.
Zipeng Pan: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
Si-Ze Qian: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
Bo Jiang: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Jinyao Yan: State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China

Journal volume & issue: Vol. 10, no. 6, pp. 1826–1836

Abstract

Short video applications like TikTok have seen significant growth in recent years. One common behavior of users on these platforms is watching and swiping through videos, which can lead to a significant waste of bandwidth. As such, an important challenge in short video streaming is to design a preloading algorithm that can effectively decide which videos to download, at what bitrate, and when to pause the download in order to reduce bandwidth waste while improving the Quality of Experience (QoE). However, designing such an algorithm is non-trivial, especially when considering the conflicting objectives of minimizing bandwidth waste and maximizing QoE. In this paper, we propose an end-to-end Deep reinforcement learning framework with Action Masking called DAM that leverages domain knowledge to learn an optimal policy for short video preloading. To achieve this, we introduce a reward shaping technique to minimize bandwidth waste and use action masking to make actions more reasonable, reduce playback rebuffering, and accelerate the training process. We have conducted extensive experiments using real-world video datasets and network traces including 4G/WiFi/5G. Our results show that DAM improves the QoE score by 3.73%-11.28% compared to state-of-the-art algorithms, and achieves an average bandwidth waste of only 10.27%-12.07%, outperforming all baseline methods.
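The abstract names action masking without describing how it works. As a rough illustration of the general technique, and not the authors' implementation, the sketch below zeroes out the probability of invalid actions before a policy samples from them. The three-action set, the logit values, and the rule that pausing is forbidden are all assumptions made up for this example.

```python
import math


def masked_softmax(logits, mask):
    """Return action probabilities with invalid actions forced to zero."""
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Invalid actions get -inf so that exp() maps them to exactly 0.
    masked = [x if ok else -math.inf for x, ok in zip(logits, mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical action set: download video 0, download video 1, or pause.
logits = [1.0, 2.0, 0.5]
# Assumed rule for illustration only: pausing is not allowed in this state.
mask = [True, True, False]
probs = masked_softmax(logits, mask)
```

Because the masked action can never be sampled, the agent does not spend training steps exploring it, which is the usual intuition behind the faster training that masking is credited with.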

Published in Digital Communications and Networks

ISSN: 2352-8648 (Online)
Publisher: KeAi Communications Co., Ltd.
Country of publisher: China
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://www.keaipublishing.com/en/journals/digital-communications-and-networks/
