Quantum-Enhanced Reinforcement Learning for Finite-Episode Games with Discrete State Spaces

Florian Neukart; David Von Dollen; Christian Seidel; Gabriele Compostella

doi:10.3389/fphy.2017.00071

Frontiers in Physics (Feb 2018)

Affiliations

Florian Neukart: Volkswagen Group of America, Herndon, VA, United States
David Von Dollen: Volkswagen Group of America, Herndon, VA, United States
Christian Seidel: Volkswagen Data:Lab, Wolfsburg, Germany
Gabriele Compostella: Volkswagen Data:Lab, Wolfsburg, Germany

Journal volume & issue: Vol. 5

Abstract

Quantum annealing algorithms belong to the class of metaheuristic tools applicable to solving binary optimization problems. Hardware implementations of quantum annealing, such as the quantum annealing machines produced by D-Wave Systems [1], have been the subject of multiple analyses aimed at characterizing the technology's usefulness for optimization and sampling tasks [2–16]. Here, we present a way to partially embed Monte Carlo policy iteration for finding an optimal policy on random observations, and a way to embed n sub-optimal state-value functions for approximating an improved state-value function given a policy, for finite-horizon games with discrete state spaces on a D-Wave 2000Q quantum processing unit (QPU). We explain how both problems can be expressed as quadratic unconstrained binary optimization (QUBO) problems, and show that quantum-enhanced Monte Carlo policy evaluation finds equivalent or better state-value functions for a given policy with the same number of episodes compared to a purely classical Monte Carlo algorithm. Additionally, we describe a quantum-classical policy learning algorithm. Our foremost aim is to explain how to represent and solve parts of these problems with the help of the QPU, not to prove supremacy over every existing classical policy evaluation algorithm.
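To make the QUBO form mentioned in the abstract concrete, the following is a minimal sketch of what such a problem looks like: minimize the sum of Q[i, j] * x[i] * x[j] over binary variables x. It is not the authors' formulation of policy evaluation. The matrix `Q`, the helper names `qubo_energy` and `solve_qubo_brute_force`, and the toy instance are all hypothetical, and exhaustive enumeration stands in for the QPU, which would instead return samples of low-energy assignments.

```python
from itertools import product


def qubo_energy(Q, x):
    """Return the QUBO objective: sum of Q[i, j] * x[i] * x[j] over all entries of Q."""
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())


def solve_qubo_brute_force(Q, n):
    """Enumerate all 2**n binary assignments and return (best_x, best_energy).

    Only feasible for very small n; a quantum annealer or a classical
    heuristic would be used for realistic problem sizes.
    """
    best_x, best_e = None, float("inf")
    for x in product((0, 1), repeat=n):
        e = qubo_energy(Q, x)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e


# Hypothetical toy instance (an assumption, not taken from the paper):
# diagonal terms reward switching a variable on, off-diagonal terms
# penalize switching on two neighbouring variables at once.
Q = {
    (0, 0): -1.0, (1, 1): -1.0, (2, 2): -1.0,
    (0, 1): 2.0, (1, 2): 2.0,
}

best_x, best_e = solve_qubo_brute_force(Q, 3)
print(best_x, best_e)
```

In this toy instance the lowest-energy assignment switches on the two non-adjacent variables, which shows how linear and pairwise terms in Q jointly encode a preference over binary configurations.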

Published in Frontiers in Physics

ISSN: 2296-424X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Physics
Website: https://www.frontiersin.org/journals/physics
