Geoscientific Model Development (Jul 2022)

swNEMO_v4.0: an ocean model based on NEMO4 for the new-generation Sunway supercomputer

  • Y. Ye,
  • Z. Song,
  • S. Zhou,
  • Y. Liu,
  • Q. Shu,
  • B. Wang,
  • W. Liu,
  • F. Qiao,
  • L. Wang

DOI
https://doi.org/10.5194/gmd-15-5739-2022
Journal volume & issue
Vol. 15
pp. 5739 – 5756

Abstract

Ocean general circulation models (OGCMs) currently face a barrier to large-scale parallelism, which makes it difficult to meet the computing demands of high-resolution simulation. Taking into account both the computational characteristics of OGCMs and the heterogeneous many-core architecture of the new Sunway supercomputer, we developed swNEMO_v4.0, a highly scalable model based on NEMO4 (Nucleus for European Modelling of the Ocean version 4). Our work presents three innovations: (1) A highly adaptive, efficient four-level parallelization framework for OGCMs is proposed, which exposes a new level of parallelism along the compute-dependency column dimension. (2) A many-core optimization method using blocking by remote memory access (RMA) and a dynamic cache scheduling strategy is applied, effectively exploiting the temporal and spatial locality of data. Tests show that, after optimization, the actual direct memory access (DMA) bandwidth exceeds 90 % of the ideal bandwidth and reaches up to 95 %. (3) A mixed-precision optimization method with half, single and double precision is explored, which can effectively improve computational performance while maintaining the simulation accuracy of OGCMs. The results demonstrate that swNEMO_v4.0 has ultrahigh scalability, achieving up to 99.29 % parallel efficiency at a resolution of 500 m using 27 988 480 cores, with a peak performance of 1.97 PFLOPS.
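
The accuracy trade-off behind item (3) can be illustrated with a minimal sketch. It is not taken from swNEMO_v4.0: the function names (`round_to`, `accumulate`, `precision_errors`) and the test data are hypothetical, and the IEEE 754 formats are emulated with Python's standard `struct` module rather than with Sunway hardware types. The sketch sums the same values with a half-, single- and double-precision accumulator, and also with half-precision storage combined with a double-precision accumulator, which is one common form of mixed precision.

```python
import math
import struct

# struct format codes for IEEE 754 binary16, binary32 and binary64
FORMATS = {"half": "<e", "single": "<f", "double": "<d"}


def round_to(value: float, fmt: str) -> float:
    """Round a Python float (a double) to the given format and back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def accumulate(values, storage_fmt: str, sum_fmt: str) -> float:
    """Sum values stored in storage_fmt using an accumulator in sum_fmt."""
    total = 0.0
    for v in values:
        stored = round_to(v, storage_fmt)         # precision of the stored data
        total = round_to(total + stored, sum_fmt)  # precision of the running sum
    return total


def precision_errors(values):
    """Absolute error of each strategy against a correctly rounded double sum."""
    reference = math.fsum(values)
    errors = {
        name: abs(accumulate(values, fmt, fmt) - reference)
        for name, fmt in FORMATS.items()
    }
    # Assumed mixed-precision variant: half storage, double accumulator.
    mixed = accumulate(values, FORMATS["half"], FORMATS["double"])
    errors["half_storage_double_sum"] = abs(mixed - reference)
    return errors


if __name__ == "__main__":
    for name, err in precision_errors([0.1] * 10000).items():
        print(f"{name:>24}: absolute error = {err:.6g}")
```

In this toy case the pure half-precision sum stops growing once the spacing between representable values exceeds the addend, whereas keeping only the accumulator in double precision limits the error to the rounding of the stored inputs. Which variables of a real ocean model tolerate reduced precision has to be established by experiment.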