IEEE Access (Jan 2024)

DEDUCT: A Secure Deduplication of Textual Data in Cloud Environments

  • Kiana Ghassabi,
  • Peyman Pahlevani

DOI
https://doi.org/10.1109/ACCESS.2024.3402544
Journal volume & issue
Vol. 12
pp. 70743 – 70758

Abstract

The exponential growth of textual data in Vision-and-Language Navigation tasks poses significant challenges for data management in large-scale storage systems. Data deduplication has emerged as a practical data-reduction strategy for such systems; however, it also raises security concerns. This paper introduces DEDUCT, a data deduplication method for textual data. DEDUCT employs a hybrid approach that combines cloud-side and client-side deduplication mechanisms to achieve high compression rates while maintaining data security. Its lightweight preprocessing and client-side deduplication make it suitable for resource-constrained hardware such as IoT devices, and it is designed to resist side-channel attacks. Experimental evaluations on the Touchdown dataset, which consists of human-written navigation instructions for routes, demonstrate the effectiveness of DEDUCT: it achieves compression rates of nearly 66%, significantly reducing storage requirements while preserving the confidentiality of textual data. This reduction in storage demand can lead to significant cost savings and improved efficiency in large-scale data management systems.

Keywords