Applied Sciences (Sep 2020)

World-Models for Bitrate Streaming

  • Harrison Brown,
  • Kai Fricke,
  • Eiko Yoneki

DOI
https://doi.org/10.3390/app10196685
Journal volume & issue
Vol. 10, no. 19
p. 6685

Abstract

Adaptive bitrate (ABR) algorithms optimize the quality of the streaming experience for users of client-side video players, especially on unreliable or slow mobile networks. Several rule-based heuristic algorithms achieve stable performance, but they sometimes fail to adapt properly to changing network conditions: fluctuating bandwidth may cause them to default to behavior that degrades the user's experience. ABR algorithms can instead be generated with reinforcement learning, a decision-making paradigm in which an agent learns to make optimal choices through interactions with an environment. Training reinforcement learning algorithms for bitrate streaming requires building a simulator in which the agent can experience interactions quickly; training in the real environment is infeasible because of its long step times. This project explores using supervised learning to construct a world-model, or learned simulator, from recorded interactions. A reinforcement learning agent trained inside the learned model, rather than a simulator, can outperform rule-based heuristics. Furthermore, agents trained inside the learned world-model can outperform model-free agents in low-sample regimes. This work highlights the potential of world-models to learn simulators quickly and to be used for generating optimal policies.
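The pipeline described in the abstract (record interactions, fit a world-model by supervised learning, then choose a policy inside the learned model) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the chunk duration, the bitrate ladder, the bandwidth range, the reward, and all function names are hypothetical. The learned model here is a simple per-bitrate mean of the recorded download times, and the policy search is an exhaustive comparison of constant-bitrate policies, standing in for the learned model and reinforcement learning agent that the abstract refers to.

```python
import random
from collections import defaultdict

CHUNK_SECONDS = 4.0          # assumed chunk duration
BITRATES = [1.0, 2.0, 4.0]   # assumed bitrate ladder, in Mbps
REBUFFER_PENALTY = 4.0       # assumed reward penalty per second of stall


def real_step(buffer_s, bitrate, rng):
    """Stand-in for the slow real environment (hypothetical dynamics)."""
    bandwidth = rng.uniform(2.5, 3.5)  # Mbps, hidden from the agent
    download_s = bitrate * CHUNK_SECONDS / bandwidth
    next_buffer = max(buffer_s - download_s, 0.0) + CHUNK_SECONDS
    return next_buffer, download_s


def record_interactions(n, seed=0):
    """Log (bitrate, download time) pairs under a random behavior policy."""
    rng = random.Random(seed)
    buffer_s, log = 0.0, []
    for _ in range(n):
        bitrate = rng.choice(BITRATES)
        buffer_s, download_s = real_step(buffer_s, bitrate, rng)
        log.append((bitrate, download_s))
    return log


def fit_world_model(log):
    """Supervised fit: predict download time as the mean per bitrate."""
    totals = defaultdict(lambda: [0.0, 0])
    for bitrate, download_s in log:
        totals[bitrate][0] += download_s
        totals[bitrate][1] += 1
    return {b: total / count for b, (total, count) in totals.items()}


def model_step(model, buffer_s, bitrate):
    """One simulated step that uses only the learned download time."""
    download_s = model[bitrate]
    stall_s = max(download_s - buffer_s, 0.0)
    next_buffer = max(buffer_s - download_s, 0.0) + CHUNK_SECONDS
    reward = bitrate - REBUFFER_PENALTY * stall_s
    return next_buffer, reward


def best_constant_bitrate(model, horizon=20):
    """Compare constant-bitrate policies by rollouts in the learned model."""
    def rollout(bitrate):
        buffer_s, total = 0.0, 0.0
        for _ in range(horizon):
            buffer_s, reward = model_step(model, buffer_s, bitrate)
            total += reward
        return total
    return max(model, key=rollout)


log = record_interactions(300)
model = fit_world_model(log)
choice = best_constant_bitrate(model)
```

In this toy setting the rollouts never touch `real_step`, which is the point of a learned simulator: once the recorded log exists, candidate policies can be evaluated as fast as the model can be stepped.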

Keywords