Simulation of Quantum Many-Body Dynamics with Tensor Processing Units: Floquet Prethermalization

Alan Morningstar; Markus Hauru; Jackson Beall; Martin Ganahl; Adam G.M. Lewis; Vedika Khemani; Guifre Vidal

doi:10.1103/PRXQuantum.3.020331

PRX Quantum (May 2022)

DOI: https://doi.org/10.1103/PRXQuantum.3.020331
Journal volume & issue: Vol. 3, no. 2, p. 020331

Abstract

Tensor processing units (TPUs) are specialized hardware accelerators developed by Google to support large-scale machine-learning tasks, but they can also be leveraged to accelerate and scale other linear-algebra-intensive computations. In this paper, we demonstrate the use of TPUs for massively parallel classical simulations of quantum many-body dynamics on long time scales. We apply our methods to study the phenomenon of Floquet prethermalization, i.e., exponentially slow heating in quantum spin chains subject to high-frequency periodic driving. We simulate the dynamics of L=34 qubits for over 10^{5} Floquet periods, corresponding to circuits with 4×10^{6} nearest-neighbor two-qubit gates. The simulated circuits have no additional symmetries and represent a pure-state evolution in the full 2^{L}-dimensional Hilbert space. This is achieved by distributing the computation over 128 TPU cores. On a TPU cluster of that size, we find wall-clock speed-ups of 230 times and 15 times over reference CPU and single-graphics-processing-unit (GPU) simulations, respectively, for shorter-time 30-qubit simulations that can be handled by all three platforms. We study the computational cost of the simulations as a function of both the number of qubits and the number of TPU cores used, up to our maximum capacity of L=40 qubits, which requires a “full pod” of 2048 TPU cores with tens of terabytes of memory in total. For these simulations, an eight-TPU-core machine is comparable to a single A100 GPU, and thus the full TPU pod is comparable to a machine with hundreds of top-of-the-line GPUs. However, unlike such large many-GPU configurations, the TPU pod is more energy and cost efficient and is readily accessible (via Google Cloud). We also study the accumulation of numerical error as a function of circuit depth in very deep circuits. Our work demonstrates that TPUs can offer significant advantages for state-of-the-art simulations of quantum many-body dynamics.
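To make the scale of a full 2^{L}-dimensional pure-state simulation concrete, the following is a minimal single-machine sketch in Python with NumPy. It is not the authors' distributed TPU implementation. The function names `statevector_bytes` and `apply_two_qubit_gate` are hypothetical, and the figure of 8 bytes per amplitude (single-precision complex numbers) is an assumption, since the abstract does not state the precision used.

```python
import numpy as np


def statevector_bytes(num_qubits, bytes_per_amplitude=8):
    """Bytes needed to store all 2**L amplitudes of a pure state.

    bytes_per_amplitude=8 assumes complex64 storage (an assumption,
    not a figure taken from the abstract).
    """
    return (2 ** num_qubits) * bytes_per_amplitude


def apply_two_qubit_gate(state, gate, i, num_qubits):
    """Apply a 4x4 gate to neighboring qubits (i, i+1) of a state vector.

    Qubit 0 is taken to be the most significant bit of the basis index.
    """
    left = 2 ** i                      # configurations of qubits 0..i-1
    right = 2 ** (num_qubits - i - 2)  # configurations of qubits i+2..L-1
    # Group the pair (i, i+1) into one axis of size 4.
    psi = state.reshape(left, 4, right)
    # Contract the gate with the pair axis; other axes are untouched.
    psi = np.einsum("ab,lbr->lar", gate, psi)
    return psi.reshape(-1)


# Usage: a CNOT on qubits (0, 1) of a 3-qubit register maps |100> to |110>.
cnot = np.array(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]],
    dtype=np.complex64,
)
psi_in = np.zeros(8, dtype=np.complex64)
psi_in[0b100] = 1.0
psi_out = apply_two_qubit_gate(psi_in, cnot, 0, 3)
```

Under the 8-byte assumption, one copy of the state vector occupies 2^{34} × 8 bytes (128 GiB) at L=34 and 2^{40} × 8 bytes (8 TiB) at L=40, before any working buffers. Each gate application touches every amplitude, so the cost per gate grows as 2^{L}.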

Published in PRX Quantum

ISSN: 2691-3399 (Online)
Publisher: American Physical Society
Country of publisher: United States
LCC subjects: Science: Physics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://journals.aps.org/prxquantum/
