Geoscientific Model Development (Sep 2011)

FAMOUS, faster: using parallel computing techniques to accelerate the FAMOUS/HadCM3 climate model with a focus on the radiative transfer algorithm

  • P. Hanappe,
  • A. Beurivé,
  • F. Laguzet,
  • L. Steels,
  • N. Bellouin,
  • O. Boucher,
  • Y. H. Yamazaki,
  • T. Aina,
  • M. Allen

DOI
https://doi.org/10.5194/gmd-4-835-2011
Journal volume & issue
Vol. 4, no. 3
pp. 835 – 844

Abstract

We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, we packed four air columns together in a single data structure and computed them simultaneously using Single Instruction Multiple Data (SIMD) operations.

The modified algorithm runs more than 50 times faster on the CELL's Synergistic Processing Element than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times compared with the original code on the main CPU. Because the radiation code takes more than 60 % of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
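
The packing of four air columns into one data structure can be illustrated with a minimal C sketch. It is not the FAMOUS radiation code: the type `float4`, the functions `column_scalar`, `column_packed` and `packed_matches_scalar`, the constant `NLEVELS`, and the toy attenuation formula are all hypothetical names and assumptions chosen for illustration. The sketch only shows why a lane-wise packed computation can return bit-wise identical results to a per-column scalar computation, assuming strict IEEE arithmetic (no fast-math style compiler options).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LANES   4   /* four columns per packed structure, as in the abstract */
#define NLEVELS 8   /* hypothetical number of vertical levels */

/* Hypothetical packed value: lane i holds the quantity for column i. */
typedef struct { float v[LANES]; } float4;

/* Scalar reference: toy attenuation of a downward flux through one column. */
static void column_scalar(const float *tau, float top_flux, float *flux)
{
    assert(tau != NULL && flux != NULL);
    float f = top_flux;
    for (size_t k = 0; k < NLEVELS; ++k) {
        f = f * (1.0f - tau[k]);
        flux[k] = f;
    }
}

/* Packed version: the same operations, applied to four columns per level.
 * The fixed-width inner loop is the part a compiler (or hand-written
 * SIMD intrinsics) can map onto 4-wide vector instructions. */
static void column_packed(const float4 *tau, float4 top_flux, float4 *flux)
{
    assert(tau != NULL && flux != NULL);
    float4 f = top_flux;
    for (size_t k = 0; k < NLEVELS; ++k) {
        for (size_t i = 0; i < LANES; ++i) {
            f.v[i] = f.v[i] * (1.0f - tau[k].v[i]);
        }
        flux[k] = f;
    }
}

/* Returns 1 if the packed and scalar results agree bit for bit, else 0. */
int packed_matches_scalar(void)
{
    float  tau_s[LANES][NLEVELS], flux_s[LANES][NLEVELS];
    float4 tau_p[NLEVELS], flux_p[NLEVELS], top;

    /* Fill both layouts with the same made-up input data. */
    for (size_t i = 0; i < LANES; ++i) {
        top.v[i] = 1000.0f + 10.0f * (float)i;
        for (size_t k = 0; k < NLEVELS; ++k) {
            tau_s[i][k] = 0.01f * (float)(i + k + 1);
            tau_p[k].v[i] = tau_s[i][k];
        }
    }

    column_packed(tau_p, top, flux_p);

    for (size_t i = 0; i < LANES; ++i) {
        column_scalar(tau_s[i], top.v[i], flux_s[i]);
        for (size_t k = 0; k < NLEVELS; ++k) {
            /* Compare raw bytes, not values, to test bit-wise identity. */
            if (memcmp(&flux_s[i][k], &flux_p[k].v[i], sizeof(float)) != 0)
                return 0;
        }
    }
    return 1;
}
```

Because each lane performs the same sequence of single-precision operations as the scalar loop, the layout change alone does not alter rounding. Calling `packed_matches_scalar()` from a test harness is one way to check this property on a given compiler and platform.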