Nauka i Obrazovanie (Jan 2015)

Using CUDA Technology for Defining the Stiffness Matrix in the Subspace of Eigenvectors

  • Yu. V. Berchun,
  • I. A. Kiselev,
  • A. S. Hahalin,
  • E. A. Sycheva,
  • T. D. Petrova,
  • V. E. Yablokov

DOI
https://doi.org/10.7463/0715.0791286
Journal volume & issue
Vol. 0, no. 7
pp. 129 – 145

Abstract

The aim is to improve the performance of solving a problem in deformable solid mechanics by using GPGPU. The paper describes technologies for computing systems that combine a central and a graphics processor and motivates the choice of CUDA as an efficient one.

The paper also analyses methods for determining natural frequencies and mode shapes (eigenforms), in particular the subspace iteration method. The method includes several stages. The paper considers the most resource-intensive stage, which defines the stiffness matrix in the subspace of eigenforms, and gives the mathematical formulation of this stage.

The choice of the GPU as the computing device is justified. The paper presents an algorithm for calculating the stiffness matrix in the subspace of eigenforms that takes the features of the input data into account. The global stiffness matrix is very sparse, and its dimension can reach tens of millions. Therefore, it is represented as a set of the stiffness matrices of the individual elements of the model. The paper analyses methods of data representation in the software and selects those best suited to GPU computing.

It describes the software implementation that uses CUDA to calculate the stiffness matrix in the subspace of eigenforms. Due to the nature of the input data, it is impossible to use the general-purpose matrix libraries (cuSPARSE and cuBLAS) to load the GPU. For efficient use of GPU resources in the software implementation, the element stiffness matrices are assembled into block matrices of a special form. The advantages of using shared memory in GPU calculations are described.

The transfer to GPU computations allowed a twentyfold increase in performance (compared with the multithreaded CPU implementation) on a medium-sized model (about 2 million degrees of freedom). Such an acceleration of one stage speeds up defining the natural frequencies and mode shapes by the subspace iteration method up to times.
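
The stage discussed in the abstract can be illustrated with a small CPU reference sketch. It assumes the standard projection K_sub = Phi^T K Phi and uses the fact that, when K is stored as a set of element matrices K_e, the projection can be accumulated element by element as the sum of Phi_e^T K_e Phi_e, where Phi_e holds the rows of Phi for the degrees of freedom of element e. This is not the paper's CUDA implementation; the function name, the data layout, and the two-spring example are all hypothetical and chosen only for illustration.

```python
# Sketch (assumption, not the paper's code): project a stiffness matrix onto a
# subspace without assembling the global matrix.
#   K_sub = Phi^T K Phi = sum over elements e of Phi_e^T K_e Phi_e

def project_stiffness(elements, phi, m):
    """elements: list of (dofs, k_e) pairs, where dofs are global DOF indices
    and k_e is the dense element stiffness matrix (list of rows).
    phi: n x m basis, stored as a list of n rows with m entries each.
    Returns the m x m projected stiffness matrix as a list of rows."""
    k_sub = [[0.0] * m for _ in range(m)]
    for dofs, k_e in elements:
        nd = len(dofs)
        # Gather the rows of Phi that belong to this element.
        phi_e = [phi[d] for d in dofs]
        # t = K_e * Phi_e  (nd x m)
        t = [[sum(k_e[i][j] * phi_e[j][c] for j in range(nd)) for c in range(m)]
             for i in range(nd)]
        # Accumulate Phi_e^T * t into the m x m result.
        for a in range(m):
            for b in range(m):
                k_sub[a][b] += sum(phi_e[i][a] * t[i][b] for i in range(nd))
    return k_sub


# Hypothetical example: a chain of two unit springs with three DOFs.
spring = [[1, -1], [-1, 1]]
elements = [((0, 1), spring), ((1, 2), spring)]
# Two basis vectors as columns: a rigid-body shift and a stretching shape.
phi = [[1, 1], [1, 0], [1, -1]]
k_sub = project_stiffness(elements, phi, 2)
```

Each element contributes independently to the small m x m result, which is what makes this stage a natural candidate for parallel execution. In a parallel version the accumulation into the shared result has to be synchronized or reduced.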

Keywords