Applied Sciences (Dec 2023)

Benchmarking GPU Tensor Cores on General Matrix Multiplication Kernels through CUTLASS

  • Xuanteng Huang,
  • Xianwei Zhang,
  • Panfei Yang,
  • Nong Xiao

DOI
https://doi.org/10.3390/app132413022
Journal volume & issue
Vol. 13, no. 24
p. 13022

Abstract


GPUs have been broadly used to accelerate big data analytics, scientific computing and machine intelligence. In particular, matrix multiplication and convolution are two principal operations that account for a large proportion of the computation in modern data analysis and deep neural networks. These performance-critical operations are often offloaded to the GPU to achieve substantial reductions in end-to-end latency. In addition, the diverse workload characteristics and complicated processing phases of big data demand a customizable yet performant operator library. To this end, GPU vendors, including NVIDIA and AMD, have released templated, composable GPU operator libraries that perform specific computations on certain low-precision data types. We formalize a set of benchmarks via CUTLASS, NVIDIA's templated library that provides high-performance, hierarchically designed kernels. The benchmarking results show that, with the necessary fine-tuning, dedicated hardware units such as tensor cores can dramatically boost the performance of specific operations, such as GEMM, offloaded to modern GPUs.
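To illustrate the kind of kernel the abstract refers to, the sketch below shows a minimal tensor-core GEMM instantiated through CUTLASS's device-level template (CUTLASS 2.x style `cutlass::gemm::device::Gemm`). The problem size, scalars, and device pointers are placeholders for illustration only; a real benchmark would allocate and initialize the operands and select tile shapes per architecture.

```cpp
// Minimal sketch: half-precision GEMM on tensor cores via CUTLASS (2.x device API).
// Assumptions: CUTLASS headers are on the include path, the target GPU is SM80-class,
// and d_A/d_B/d_C are device buffers allocated and filled elsewhere.
#include <cutlass/gemm/device/gemm.h>

int run_gemm(cutlass::half_t* d_A, cutlass::half_t* d_B, cutlass::half_t* d_C,
             int M, int N, int K) {
  // C = alpha * A * B + beta * C, all operands column-major,
  // FP16 inputs/outputs with FP32 accumulation on tensor cores.
  using Gemm = cutlass::gemm::device::Gemm<
      cutlass::half_t, cutlass::layout::ColumnMajor,   // A
      cutlass::half_t, cutlass::layout::ColumnMajor,   // B
      cutlass::half_t, cutlass::layout::ColumnMajor,   // C
      float,                                           // accumulator type
      cutlass::arch::OpClassTensorOp,                  // use tensor cores
      cutlass::arch::Sm80>;                            // target architecture

  float alpha = 1.0f, beta = 0.0f;

  Gemm gemm_op;
  cutlass::Status status = gemm_op({
      {M, N, K},     // problem size
      {d_A, M},      // A and its leading dimension
      {d_B, K},      // B and its leading dimension
      {d_C, M},      // C (source operand)
      {d_C, M},      // D (destination, written in place here)
      {alpha, beta}  // epilogue scalars
  });

  return status == cutlass::Status::kSuccess ? 0 : 1;
}
```

Swapping the element types, layouts, or operator class in the template parameters yields the different kernel variants that such a benchmark suite would sweep over.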

Keywords