CAAI Transactions on Intelligence Technology (Apr 2024)

A multi‐feature‐based intelligent redundancy elimination scheme for cloud‐assisted health systems

  • Ling Xiao,
  • Beiji Zou,
  • Xiaoyan Kui,
  • Chengzhang Zhu,
  • Wensheng Zhang,
  • Xuebing Yang,
  • Bob Zhang

DOI
https://doi.org/10.1049/cit2.12211
Journal volume & issue
Vol. 9, no. 2
pp. 491 – 510

Abstract

Redundancy elimination techniques are extensively investigated to reduce storage overheads for cloud‐assisted health systems. Deduplication eliminates the redundancy of duplicate blocks by storing one physical instance that is referenced by all of its duplicates. Delta compression is usually regarded as a complementary technique to deduplication that further removes the redundancy of similar blocks, but our observations indicate that this does not hold when the data contain only sparse duplicate blocks. In addition, the resemblance detection process of post‐deduplication delta compression produces many overlapped deltas, which hinders the efficiency of delta compression, and the index phase of resemblance detection queries a large number of non‐similar blocks, which lowers system throughput. Therefore, a multi‐feature‐based redundancy elimination scheme, called MFRE, is proposed to solve these problems. The similarity feature and the temporal locality feature are extracted to assist redundancy elimination, where the similarity feature is a good indicator of the duplicate attribute. Then, similarity‐based dynamic post‐deduplication delta compression and temporal locality‐based dynamic delta compression discover more similar base blocks to minimise overlapped deltas and improve compression ratios. Moreover, a clustering method based on block relationships and a feature index strategy based on Bloom filters reduce I/O overheads and improve system throughput. Experiments demonstrate that, compared to the state‐of‐the‐art method, the proposed method improves the compression ratio and system throughput by 9.68% and 50%, respectively.
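As a rough illustration of two building blocks named in the abstract, the sketch below shows block‐level deduplication (one stored instance referenced by every duplicate) with a small Bloom filter consulted before the index. It is not the MFRE scheme: the fixed‐size blocks, the SHA‐256 fingerprints, the filter parameters, and the names `BloomFilter`, `DedupStore`, and `store_stream` are all assumptions made for this example, and delta compression is left out.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; sizes are illustrative assumptions, not tuned values."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting SHA-256 with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class DedupStore:
    """Keeps one physical copy per distinct block, keyed by fingerprint."""

    def __init__(self):
        self.blocks = {}          # fingerprint -> block bytes
        self.seen = BloomFilter()
        self.index_lookups = 0    # counts index queries the filter let through

    def put(self, block):
        fp = hashlib.sha256(block).digest()
        # Query the index only when the filter says the block may exist.
        if self.seen.might_contain(fp):
            self.index_lookups += 1
            if fp in self.blocks:
                return fp.hex()   # duplicate: return a reference only
        self.blocks[fp] = block
        self.seen.add(fp)
        return fp.hex()


def store_stream(store, data, block_size=4):
    """Split data into fixed-size blocks (an assumption) and return the recipe."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


data = b"AAAABBBBAAAACCCC"
store = DedupStore()
recipe = store_stream(store, data)
restored = b"".join(store.blocks[bytes.fromhex(fp)] for fp in recipe)
```

Here the four logical blocks are stored as three physical blocks, and the recipe of fingerprints is enough to rebuild the original stream. The filter lets new blocks skip the index lookup in most cases, which is the general idea behind using Bloom filters to cut index I/O.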

Keywords