Hangkong bingqi (Feb 2024)

Research on the Development Status of High Speed Interconnection Technologies and Topologies of Multi-GPU Systems

  • Cui Chen, Wu Di, Tao Yerong, Zhao Yanli

DOI
https://doi.org/10.12132/ISSN.1673-5048.2023.0138
Journal volume & issue
Vol. 31, no. 1
pp. 23–31

Abstract

Multi-GPU systems improve performance by scaling out to meet the ever-increasing computational demand driven by increasingly complex algorithms and continuously growing data volumes in artificial intelligence. The interconnection bandwidth between processors and the system topology are the key factors that determine the performance of multi-GPU systems. In traditional PCIe-based multi-GPU systems, PCIe bandwidth is the bottleneck that limits system performance. GPU-oriented high-speed interconnection technologies have become an effective way to overcome this bandwidth limitation. This article first introduces PCIe interconnection technology and the typical topologies used in traditional multi-GPU systems. Then, taking Nvidia NVLink, AMD Infinity Fabric Link, Intel Xe Link, and Biren Technology BLink as examples, it reviews and analyzes the GPU-oriented high-speed interconnection technologies and topologies of representative GPU vendors at home and abroad. Finally, the research implications of interconnection technologies are discussed.

Keywords