GPU-HADVPPM V1.0: a high-efficiency parallel GPU design of the piecewise parabolic method (PPM) for horizontal advection in an air quality model (CAMx V6.10)

K. Cao; Q. Wu; L. Wang; N. Wang; H. Cheng; X. Tang; D. Li; L. Wang

doi:10.5194/gmd-16-4367-2023

Geoscientific Model Development (Aug 2023)

Affiliations

K. Cao: College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
Q. Wu: College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
L. Wang: Henan Ecological Environment Monitoring and Safety Center, Henan Key Laboratory of Environmental Monitoring Technology, Zhengzhou 450000, China
N. Wang: Henan Ecological Environment Monitoring and Safety Center, Henan Key Laboratory of Environmental Monitoring Technology, Zhengzhou 450000, China
H. Cheng: College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
X. Tang: State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
D. Li: College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
L. Wang: College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China

DOI: https://doi.org/10.5194/gmd-16-4367-2023
Journal volume & issue: Vol. 16
pp. 4367 – 4383

Abstract

With semiconductor technology gradually approaching its physical and thermal limits, graphics processing units (GPUs) are becoming an attractive solution for many scientific applications due to their high performance. This paper presents an application of GPU accelerators in an air quality model. We demonstrate an approach that runs a piecewise parabolic method (PPM) solver of horizontal advection (HADVPPM) for the air quality model CAMx on GPU clusters. Specifically, we first convert HADVPPM to new Compute Unified Device Architecture C (CUDA C) code so that it can run on the GPU (GPU-HADVPPM). Then, a series of optimization measures are taken, including reducing the CPU–GPU communication frequency, increasing the amount of data computed on the GPU, optimizing GPU memory access, and using thread and block indices, to improve the overall computing performance of the CAMx model coupled with GPU-HADVPPM (named the CAMx-CUDA model). Finally, a heterogeneous, hybrid programming paradigm is presented and utilized with GPU-HADVPPM on the GPU clusters with a message passing interface (MPI) and CUDA. The offline experimental results show that running GPU-HADVPPM on one NVIDIA Tesla K40m GPU and one NVIDIA Tesla V100 GPU can achieve up to 845.4× and 1113.6× acceleration, respectively. With the optimization schemes in place, the CAMx-CUDA model achieves 29.0× and 128.4× improvements in computational efficiency when using a GPU accelerator card on the K40m and V100 clusters, respectively. In terms of single-module computational efficiency, GPU-HADVPPM achieves 1.3× and 18.8× speedups on an NVIDIA Tesla K40m GPU and an NVIDIA Tesla V100 GPU, respectively. The multi-GPU acceleration algorithm enables a 4.5× speedup with eight CPU cores and eight GPU accelerators on a V100 cluster.
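
The abstract mentions using thread and block indices so that grid cells can be computed in parallel on the GPU. The following is a minimal Python sketch of that indexing idea only, not the authors' CUDA C code. It emulates a one-dimensional kernel launch on the CPU with two loops, and it replaces PPM with a much simpler first-order upwind update on a periodic row. The names global_index, upwind_step, conc, courant and block_dim are hypothetical and chosen for illustration.

```python
def global_index(block_idx, block_dim, thread_idx):
    """Flattened cell index, mirroring CUDA's
    blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def upwind_step(conc, courant, block_dim=4):
    """One first-order upwind advection step (NOT PPM) on a periodic 1-D row.

    Assumes flow in the positive direction with 0 <= courant <= 1.
    Each (block, thread) pair updates exactly one cell; on a GPU these
    iterations would run concurrently instead of in nested loops.
    """
    n = len(conc)
    n_blocks = (n + block_dim - 1) // block_dim  # ceiling division
    new = list(conc)  # separate output buffer: reads never see new values
    for block_idx in range(n_blocks):
        for thread_idx in range(block_dim):
            i = global_index(block_idx, block_dim, thread_idx)
            if i >= n:
                # Guard: the last block may contain surplus threads.
                continue
            # conc[i - 1] wraps to the last cell when i == 0 (periodic row).
            new[i] = conc[i] - courant * (conc[i] - conc[i - 1])
    return new
```

Because every cell reads only the old array and writes only its own entry in the new one, the per-cell updates are independent, which is the property that lets one thread be assigned to each cell.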

Published in Geoscientific Model Development

ISSN: 1991-959X (Print); 1991-9603 (Online)
Publisher: Copernicus Publications
Country of publisher: Germany
LCC subjects: Science: Geology
Website: https://www.geoscientific-model-development.net/
