Journal of Biomechanical Science and Engineering (Jul 2014)
A full GPU implementation of a numerical method for simulating capsule suspensions
Abstract
Although boundary element (BE) methods are highly accurate for simulating capsule suspensions in Stokes flows, their computational time has been a major issue, even when only a few capsules are simulated. We propose a full graphics processing unit (GPU) implementation of a numerical method that couples the BE method for the fluid mechanics with the finite element method for the membrane mechanics. On a single GPU, the performance reaches 0.12 TFlop/s for one capsule (2562 nodes and 5120 elements) and 0.29 TFlop/s for two capsules. The performance increases with the number of capsules, reaching a maximum of 0.59 TFlop/s. We also implement a multi-GPU method in which data communication overlaps with computation. A weak scaling test shows perfect scalability for any number of computational nodes per GPU, indicating that the communication time is completely hidden. To illustrate practical use, we estimate the computational time required for 10000 time steps: simulating one capsule and two capsules on one GPU takes only 2.0 and 9.1 minutes, respectively, and simulating 256 capsules on 16 GPUs takes 3.8 days.
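The total times quoted above can be converted into an average wall-clock time per time step, which is often easier to compare across configurations. The following is a minimal sketch of that arithmetic only. It uses the figures stated in the abstract, and the names `STEPS`, `runs`, and `seconds_per_step` are illustrative assumptions, not part of the authors' code.

```python
# Average wall-clock time per time step, derived from the totals in the abstract.
# All names here are hypothetical; only the numbers come from the abstract.

STEPS = 10_000  # number of time steps used for the estimate

# (configuration label, total wall-clock time in seconds)
runs = [
    ("1 capsule, 1 GPU", 2.0 * 60),               # 2.0 minutes
    ("2 capsules, 1 GPU", 9.1 * 60),              # 9.1 minutes
    ("256 capsules, 16 GPUs", 3.8 * 24 * 3600),   # 3.8 days
]


def seconds_per_step(total_seconds, steps=STEPS):
    """Return the average time per step in seconds."""
    return total_seconds / steps


per_step = {label: seconds_per_step(total) for label, total in runs}

for label, value in per_step.items():
    print(f"{label}: {value:.4g} s per step")
```

Under these figures, one capsule costs roughly 12 ms per step and two capsules roughly 55 ms per step on one GPU, while the 256-capsule run on 16 GPUs averages about 33 s per step.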
Keywords