IEEE Access (Jan 2024)

Novel Hardware Implementation of Deduplicating Visually Identical JPEG Image Chunks

  • Thang Luong,
  • Luan Dinh,
  • Hung Nguyen,
  • Linh Tran

DOI
https://doi.org/10.1109/ACCESS.2024.3401153
Journal volume & issue
Vol. 12
pp. 69568 – 69577

Abstract

The exponential growth of data, particularly JPEG images spurred by mobile and social media applications, poses significant storage challenges for data centers. Traditional deduplication methods such as fixed-size partitioning (FSP) or content-defined chunking (CDC) operate on binary data, which can differ even when two images are visually identical. PXDedup, a chunk-based image deduplication strategy, addresses this issue by recognizing and eliminating visual redundancies, and it offers substantial improvements in JPEG image deduplication. However, its software implementation on standard CPUs suffers from low throughput because of the CPUs' limited processing capabilities. This study introduces an optimized deduplication method that leverages PXDedup to target visual redundancies in JPEG images. Our research focuses on hardware acceleration for images of 512 × 512 pixels. By implementing this approach on the Digilent Genesys 2 board with a Xilinx Kintex-7 FPGA, we achieve a throughput of 25 MB/s, ten times faster than a single-core CPU implementation and three times faster than a quad-core CPU setup. Additionally, our analysis indicates minimal disparity between the results of the software and hardware implementations across all three datasets. This evaluation underscores the effectiveness of hardware acceleration in enhancing deduplication throughput while preserving accuracy and reliability.
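
To illustrate why byte-level chunking can miss visually identical images, the following is a minimal sketch and not the PXDedup algorithm or the hardware design described above. It assumes a toy container format (a comment field followed by zlib-compressed pixels) as a stand-in for JPEG, and the names `encode`, `decode`, `chunk_hashes`, and the 64-byte chunk size are hypothetical choices made only for this example.

```python
import hashlib
import zlib

CHUNK = 64  # bytes per fixed-size chunk (hypothetical value for this sketch)


def encode(pixels: bytes, comment: bytes, level: int) -> bytes:
    """Toy container (assumption, not JPEG): 2-byte comment length,
    the comment itself, then the zlib-compressed pixel data."""
    return len(comment).to_bytes(2, "big") + comment + zlib.compress(pixels, level)


def decode(blob: bytes) -> bytes:
    """Recover the raw pixel bytes from the toy container."""
    n = int.from_bytes(blob[:2], "big")
    return zlib.decompress(blob[2 + n:])


def chunk_hashes(data: bytes, size: int = CHUNK) -> set:
    """Fixed-size chunking: hash every `size`-byte chunk of `data`."""
    return {
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    }


# One 32 x 32 grayscale "image", stored twice with different metadata
# and different encoder settings.
pixels = bytes((x * 7 + y * 13) % 256 for y in range(32) for x in range(32))
file_a = encode(pixels, b"camera", 9)
file_b = encode(pixels, b"edited by app", 1)

# Chunks shared when hashing the stored bytes directly.
binary_shared = chunk_hashes(file_a) & chunk_hashes(file_b)

# Chunks shared when hashing the decoded pixel data instead.
pixel_shared = chunk_hashes(decode(file_a)) & chunk_hashes(decode(file_b))

print("files byte-identical:", file_a == file_b)
print("pixels identical:", decode(file_a) == decode(file_b))
print("shared binary chunks:", len(binary_shared))
print("shared pixel chunks:", len(pixel_shared))
```

In this sketch the two files decode to the same pixels, so every pixel-domain chunk is shared, while the stored byte streams differ in both metadata and encoding. Real JPEG data is lossy and structured differently, so this only conveys the general idea of comparing decoded content rather than file bytes.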

Keywords