Applied Sciences (Jan 2024)

A Parallel Privacy-Preserving k-Means Clustering Algorithm for Encrypted Databases in Cloud Computing

  • Youngho Song,
  • Hyeong-Jin Kim,
  • Hyun-Jo Lee,
  • Jae-Woo Chang

DOI
https://doi.org/10.3390/app14020835
Journal volume & issue
Vol. 14, no. 2
p. 835

Abstract

With the development of cloud computing, interest in database outsourcing has increased. However, an outsourced database exposes the data owner's information to both internal and external attackers. In this paper, we therefore propose decimal-based encryption operation protocols that support privacy preservation. Compared with binary-based encryption operation protocols, the proposed protocols improve operational efficiency by eliminating the repeated operations that depend on bit length. Building on these protocols, we propose a privacy-preserving k-means clustering algorithm. To provide high query processing performance, we also propose a parallel k-means clustering algorithm that supports thread-based parallel processing by using a random value pool. A security analysis of both the proposed k-means clustering algorithm and the proposed parallel algorithm shows that they provide data protection, query protection, and access pattern protection. Our performance analysis shows that the proposed k-means clustering algorithm performs about 10 to 13 times better than the existing algorithms.

Keywords