IEEE Access (Jan 2020)

Parallelly Running k-Nearest Neighbor Classification Over Semantically Secure Encrypted Data in Outsourced Environments

  • Jeongsu Park,
  • Dong Hoon Lee

DOI
https://doi.org/10.1109/ACCESS.2020.2984579
Journal volume & issue
Vol. 8
pp. 64617 – 64633

Abstract

Cloud services with powerful resources are widely used to manage exponentially growing data and to run data mining over it. However, a data mining task that involves a query can cause privacy problems by disclosing both the data and the query. Classification is a data mining task used in a wide range of applications, and this study focuses on k-nearest neighbor (kNN) classification. Although several studies have attempted to address the privacy problems of kNN computation in a cloud environment, their solutions remain inefficient. In this paper, we propose an efficient, privacy-preserving kNN classification (PkNC) over encrypted data. The computation (encryptions/decryptions and exponentiations) and communication of the most efficient kNN classification in prior studies is bounded by O(kln), whereas that of the proposed PkNC is bounded by O(ln), where l is the domain size of the data and n is the number of data items. In experiments on the same dataset, the prior kNN classification took 12.02 to 55.5 minutes, whereas PkNC took 4.16 minutes. Furthermore, because PkNC can be carried out in parallel for each data item, its performance can improve substantially on a machine that supports more threads. PkNC protects the privacy of the dataset and of the input query, including the kNN result, and does not disclose any data access patterns. We propose several protocols that serve as building blocks for PkNC and formally prove their security. In particular, we propose efficient protocols that privately find the k largest or smallest elements in an array.
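For orientation, the functionality that the abstract describes can be sketched in plaintext. The following is not the PkNC protocol: it uses no encryption and hides neither data nor access patterns. It only illustrates, under assumed names (`squared_distance`, `knn_classify`, the toy `records` and `labels`), the three steps the abstract mentions: a distance computation that is independent for each data item and can therefore run in parallel, a selection of the k smallest elements of an array, and a majority vote over the selected labels.

```python
import heapq
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def squared_distance(record, query):
    """Squared Euclidean distance between one record and the query."""
    return sum((a - b) ** 2 for a, b in zip(record, query))


def knn_classify(records, labels, query, k, workers=4):
    """Plaintext kNN classification (illustrative only, no privacy)."""
    # Each distance depends on a single record, so the work can be
    # split across threads, one task per data item.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        distances = list(pool.map(lambda r: squared_distance(r, query), records))

    # Select the indices of the k smallest distances in the array.
    nearest = heapq.nsmallest(k, range(len(records)), key=distances.__getitem__)

    # Majority vote over the labels of the k nearest records.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy dataset, not taken from the paper.
records = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8)]
labels = ["a", "a", "a", "b", "b"]
print(knn_classify(records, labels, (2, 2), k=3))
```

In the setting of the paper, each of these steps would instead operate on ciphertexts through the proposed building-block protocols; the sketch above only fixes what result such protocols are expected to produce.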

Keywords