Physical Review X (Mar 2022)
Model-Free Quantum Control with Reinforcement Learning
Abstract
Model bias is an inherent limitation of the current dominant approach to optimal quantum control, which relies on a system simulation for optimization of control policies. To overcome this limitation, we propose a circuit-based approach for training a reinforcement learning agent on quantum control tasks in a model-free way. Given a continuously parametrized control circuit, the agent learns its parameters through trial-and-error interaction with the quantum system, using measurement outcomes as the only source of information about the quantum state. Focusing on control of a harmonic oscillator coupled to an ancilla qubit, we show how to reward the learning agent with measurements of experimentally available observables. We train the agent to prepare various nonclassical states via both unitary control and control with adaptive measurement-based quantum feedback, and to execute logical gates on encoded qubits. The agent does not rely on averaging for state tomography or fidelity estimation, and significantly outperforms widely used model-free methods in terms of sample efficiency. Our numerical work is of immediate relevance to superconducting-circuit and trapped-ion platforms, where such training can be implemented experimentally, allowing complete elimination of model bias and the adaptation of quantum control policies to the specific system in which they are deployed.
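The trial-and-error loop described above can be illustrated with a deliberately small toy problem. The sketch below is not the oscillator-plus-ancilla system or the training algorithm of this work. It assumes a single qubit, a one-parameter circuit R_y(theta), and a plain REINFORCE-style policy gradient with a Gaussian policy over theta. The names `measure_qubit` and `train`, and all hyperparameters, are hypothetical. The point it shows is that the learner only ever sees single-shot measurement outcomes (+1 or -1) and never the state or the outcome probabilities.

```python
import numpy as np


def measure_qubit(theta, rng):
    """Stand-in for the experiment (assumption: an ideal single qubit).

    Applies R_y(theta) to |0>, measures in the Z basis, and returns a
    single-shot reward: +1 for outcome |1> (the target), -1 otherwise.
    The learner never sees p_one, only the sampled outcomes.
    """
    theta = np.asarray(theta, dtype=float)
    p_one = np.sin(theta / 2.0) ** 2
    return np.where(rng.random(theta.shape) < p_one, 1.0, -1.0)


def train(n_epochs=300, batch_size=200, sigma=0.3, lr=0.05, seed=0):
    """Toy REINFORCE update of the mean of a Gaussian policy over theta."""
    rng = np.random.default_rng(seed)
    mu = 0.5  # initial guess for the circuit parameter
    for _ in range(n_epochs):
        # Sample candidate circuit parameters from the policy N(mu, sigma^2).
        thetas = mu + sigma * rng.standard_normal(batch_size)
        # One measurement shot per candidate; no averaging per candidate.
        rewards = measure_qubit(thetas, rng)
        # Subtract the batch-mean reward as a baseline to reduce variance.
        advantages = rewards - rewards.mean()
        # Score function of a Gaussian policy: d/dmu log pi = (theta - mu) / sigma^2.
        grad = np.mean(advantages * (thetas - mu) / sigma**2)
        mu += lr * grad
    return mu


if __name__ == "__main__":
    mu = train()
    print("learned theta:", mu)
    print("probability of the target outcome:", np.sin(mu / 2.0) ** 2)
```

In this toy setting the learned parameter drifts toward a rotation angle of pi, where the target outcome is obtained with near-unit probability. A fixed policy width `sigma` is used only to keep the sketch short.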