Multi–GPU Implementation of Machine Learning Algorithm using CUDA and OpenCL

Jan Masek; Radim Burget; Lukas Povoda; Malay Kishore Dutta

doi:10.11601/ijates.v5i2.142

International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems (Jun 2016)

Affiliations

Jan Masek: BurgSys, a.s.
Radim Burget: Brno University of Technology
Lukas Povoda: Brno University of Technology
Malay Kishore Dutta: Amity University

DOI: https://doi.org/10.11601/ijates.v5i2.142
Journal volume & issue: Vol. 5, no. 2
pp. 101 – 107

Abstract

Modern graphics processing units (GPUs) are well suited to complex and time-consuming computations, offering high-performance computation at a good price. This paper presents multi-GPU OpenCL and CUDA implementations of the k-Nearest Neighbor (k-NN) algorithm. It compares the performance of the OpenCL and CUDA implementations, each of which is suited to a different number of attributes. The proposed CUDA algorithm achieves a speedup of up to 880x over a single-threaded CPU version. The common k-NN algorithm was modified to run faster when a low number of neighbors k is set. Performance was verified on two dual-GPU NVIDIA GeForce GTX 690 cards and an Intel Core i7 3770 CPU running at 4.1 GHz. Speedup was measured for one, two, three, and four GPUs. We performed several tests with data sets containing up to 4 million elements and various numbers of attributes.
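For readers unfamiliar with k-NN, the following is a minimal single-threaded sketch of brute-force k-NN classification, roughly the kind of CPU baseline a GPU version would be compared against. It is not the authors' implementation: the function name `knn_classify`, the sample data, and the bounded-heap selection are assumptions chosen for illustration. The heap is one common way to avoid a full sort when k is small; the abstract does not describe the paper's actual modification.

```python
import heapq
from collections import Counter


def knn_classify(train, labels, query, k):
    """Brute-force k-NN: scan every training point, keep only the k closest."""
    # Bounded max-heap built from negated distances (an assumed selection
    # strategy, not taken from the paper): heap[0] is the worst kept neighbor.
    heap = []
    for idx, point in enumerate(train):
        # Squared Euclidean distance over all attributes; the square root is
        # unnecessary because only the ranking of distances matters.
        dist = sum((a - b) ** 2 for a, b in zip(point, query))
        if len(heap) < k:
            heapq.heappush(heap, (-dist, idx))
        elif dist < -heap[0][0]:
            # Closer than the current worst neighbor: swap it in.
            heapq.heapreplace(heap, (-dist, idx))
    # Majority vote among the labels of the k nearest neighbors.
    votes = Counter(labels[idx] for _, idx in heap)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: five points with two attributes each.
train = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
labels = ["a", "a", "b", "b", "b"]
prediction = knn_classify(train, labels, (0.2, 0.1), k=3)
```

In this sketch the distance loop over training points is the part that is independent per point, which is why brute-force k-NN is a natural candidate for GPU parallelization; the cost of each distance grows with the number of attributes.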

Published in International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems

ISSN: 1805-5443 (Online)
Publisher: International Science and Engineering Society, o.s.
Country of publisher: Czechia
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication
Website: http://www.ijates.org
