Journal of Systemics, Cybernetics and Informatics (Jun 2009)
Accelerating Image Based Scientific Applications using Commodity Video Graphics Adapters
Abstract
The processing power available in current video graphics cards is approaching supercomputer levels. State-of-the-art graphics processing units (GPUs) boast computational performance in the range of 1.0-1.1 trillion floating-point operations per second (1.0-1.1 teraflops). Making this processing power accessible to the scientific community would benefit many fields of research. This research takes a relatively computationally expensive image-based iris segmentation algorithm and hosts it on a GPU using the High Level Shader Language (HLSL), which is part of DirectX 9.0. The selected segmentation algorithm uses basic image processing techniques such as image inversion, value squaring, thresholding, dilation, and erosion, together with a computationally intensive local kurtosis (fourth central moment) calculation. Strengths and limitations of the DirectX rendering pipeline are discussed, and the primary source of the graphics processing power, the pixel (or fragment) shader, is discussed in detail. The iris segmentation algorithm was accelerated by a factor of 40 over a highly optimized C++ version hosted on the computer's central processing unit (CPU), and some parts of the algorithm ran over 100 times faster than their C++ counterparts. GPU programming details and HLSL code samples are presented as part of the acceleration discussion.
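The image operations named in the abstract can be illustrated with a small CPU reference sketch. This is not the authors' HLSL or C++ code. The function names, the 3x3 window, the clamped border handling, and the normalized [0, 1] pixel range are all assumptions made for illustration. Each output pixel depends only on a small neighborhood of the input, which is the property that lets such operations map onto a pixel shader. The moment computed here is the raw fourth central moment; a normalized kurtosis would further divide it by the squared variance.

```python
def invert(img, max_val=1.0):
    """Image inversion: max_val - pixel (assumes pixels lie in [0, max_val])."""
    return [[max_val - p for p in row] for row in img]


def square(img):
    """Value squaring, applied independently to every pixel."""
    return [[p * p for p in row] for row in img]


def threshold(img, t):
    """Binary threshold: 1.0 where pixel >= t, otherwise 0.0."""
    return [[1.0 if p >= t else 0.0 for p in row] for row in img]


def _window(img, y, x, radius):
    """Collect the (2*radius+1)^2 neighborhood, clamping coordinates at the border.

    Clamping is an assumed border policy, chosen for this sketch only.
    """
    h, w = len(img), len(img[0])
    return [
        img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
    ]


def _map_windows(img, radius, fn):
    """Apply fn to the neighborhood of every pixel and return a new image."""
    return [
        [fn(_window(img, y, x, radius)) for x in range(len(img[0]))]
        for y in range(len(img))
    ]


def dilate(img, radius=1):
    """Dilation: maximum over the local window."""
    return _map_windows(img, radius, max)


def erode(img, radius=1):
    """Erosion: minimum over the local window."""
    return _map_windows(img, radius, min)


def local_fourth_moment(img, radius=1):
    """Fourth central moment of each pixel's local window (not normalized)."""
    def moment(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 4 for v in values) / len(values)

    return _map_windows(img, radius, moment)
```

A typical call chains the steps, for example `threshold(local_fourth_moment(square(invert(img))), t)`. The order of operations and the threshold `t` here are placeholders and do not reproduce the segmentation pipeline described in the paper.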