网络与信息安全学报 (Aug 2024)

Efficient algorithms for string pattern matching to protect search content

  • Jianxu WANG, Rui LI, Yin LI

DOI
https://doi.org/10.11959/j.cnki.cn2096-109x.2024061
Journal volume & issue
Vol. 10, no. 4
pp. 159 – 174

Abstract

Read online

Efficient algorithms for online string pattern matching have been deemed crucial for cloud database retrieval. However, the potential leaking of search content has been recognized as a threat to users' privacy. Existing string pattern matching algorithms have not been designed to consider the protection of users' search content. Although searchable encryption schemes can protect users' search content, they have been associated with high index construction costs and inefficient retrieval. Two pattern matching algorithms were proposed to protect users' search content: PMDPF (pattern matching based on distributed point function) and JPMDPF (jumping pattern matching based on distributed point function). In the PMDPF algorithm, a pattern string was provided to protect the search content, and a distributed point function combined with a fingerprint function was utilized to construct true value tables, which were then sent to two independent servers. Character comparison operations were transformed into table lookup operations to conceal the characters that needed to be checked. To enhance search efficiency, a jump pattern matching algorithm (JPMDPF) was proposed, based on the distributed point function. In this approach, character skipping was employed to induce the JPMDPF algorithm, which achieved a significant improvement in search efficiency over the PMDPF algorithm, albeit with the trade-off of revealing more information. On average, the search efficiency of JPMDPF was increased by approximately m-fold, where m represented the length of the pattern string. Concurrently, there was a significant reduction in the probability of misjudgments due to collisions within the fingerprinting function. The experimental results demonstrate that the search efficiency of the PMDPF algorithm is approximately 5% higher than that of the classical algorithm based on the fingerprint function. It outperforms existing searchable encryption schemes. The search time of the PMDPF algorithm is found to be 4.2 times that of JPMDPF when the length of the search content is 4.

Keywords