Informatică economică (Jan 2012)

Optimization Solutions for Improving the Performance of the Parallel Reduction Algorithm Using Graphics Processing Units

  • Ion LUNGU,
  • Dana-Mihaela PETROSANU,
  • Alexandru PIRJAN

Journal volume & issue
Vol. 16, no. 3
pp. 72 – 86

Abstract

In this paper, we research, analyze and develop optimization solutions for the parallel reduction function on graphics processing units (GPUs) that implement the Compute Unified Device Architecture (CUDA), a modern approach to improving the software performance of data processing applications and algorithms, many of which use the reduction function in their computational steps. After designing the function and its algorithmic steps in CUDA, we progressively developed and implemented optimization solutions for it. To test and evaluate the efficiency of these solutions, we developed a custom-tailored benchmark suite. We analyzed the experimental results with respect to: the execution time and bandwidth of graphics processing units covering the main CUDA architectures (Tesla GT200, Fermi GF100, Kepler GK104) compared with a central processing unit; the influence of the data type; and the influence of the binary operator.

Keywords