IEEE Access (Jan 2020)

Privacy-Preserving Substring Search on Multi-Source Encrypted Gene Data

  • Shiyue Qin,
  • Fucai Zhou,
  • Zongye Zhang,
  • Zifeng Xu

DOI
https://doi.org/10.1109/ACCESS.2020.2980375
Journal volume & issue
Vol. 8
pp. 50472 – 50484

Abstract

Substring search on gene sequence data is widely used to analyze the association between a list of gene mutations and a specific disease. Because substring search usually has a high computational cost, deploying it in the cloud has become a popular solution. Moreover, the cloud makes it easier for medical organizations to share data. Since gene data contains private information, medical organizations usually outsource encrypted gene data to the cloud and selectively share it with others. Most existing solutions to the privacy-preserving substring search problem are based on searchable encryption. However, they mainly focus on single-source gene data and are not suitable for handling multi-source gene data because of some practical weaknesses. In this paper, we propose a cryptographic scheme that supports privacy-preserving substring search on multi-source encrypted gene data. The cloud can authorize queriers with access control and perform substring searches over the multi-source encrypted gene data for them. Although the outsourced gene data is encrypted with different keys, an authorized querier can issue a substring search using only its own key. We adopt the composite-order bilinear map as the primary underlying cryptographic primitive of our scheme. We mainly focus on protecting the privacy of the outsourced gene data and the substring queries, and we provide a security analysis under the honest-but-curious model. We also perform experiments on different datasets and analyze the experimental results in terms of computation cost and communication cost. The analyses show that our scheme is secure and efficient for substring search on multi-source encrypted gene data.

Keywords