IEEE Access (Jan 2018)

LDFS: A Low Latency In-Line Data Deduplication File System

  • Yongtao Zhou,
  • Yuhui Deng,
  • Laurence T. Yang,
  • Ru Yang,
  • Lei Si

DOI
https://doi.org/10.1109/ACCESS.2018.2800763
Journal volume & issue
Vol. 6
pp. 15743 – 15753

Abstract

Due to the rapid proliferation of sensors and intelligent devices, cyber-physical-social computing and networking (CPSCN) is emerging as a new computing paradigm, and massive amounts of data are generated in CPSCN environments. Traditional data deduplication cannot handle the CPSCN environment because of the long latency it involves. This paper presents a low latency in-line data deduplication file system (LDFS). LDFS decouples unique data blocks from the fingerprint index by writing the address of each data block to both the corresponding file recipe and the fingerprint index, thus avoiding fingerprint index accesses on the read path. LDFS assigns every unique data block a globally unique ID, so only one disk access is required to obtain a block's reference count using that ID. To guarantee write performance, LDFS employs finer-grained locking to optimize the block flushing strategy of the write buffer. Experimental results demonstrate that LDFS significantly enhances read and write performance on the critical path compared with the traditional deduplication file system LessFS. Meanwhile, LDFS achieves almost the same deduplication ratio (40.8) as LessFS.
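
The decoupling described in the abstract can be illustrated with a small in-memory sketch. This is not the LDFS implementation: the class name `DedupStore`, the 4 KiB block size, the SHA-256 fingerprint, and the use of Python lists in place of on-disk structures are all assumptions made for illustration. It shows only the two ideas the abstract states: the file recipe stores block addresses directly, so reads never consult the fingerprint index, and each unique block has an ID that indexes its reference count directly.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


class DedupStore:
    """Hypothetical in-memory model of recipe/index decoupling."""

    def __init__(self):
        self.blocks = []     # simulated disk: address -> block bytes
        self.fp_index = {}   # fingerprint -> (address, block_id)
        self.refcounts = []  # block_id -> reference count (direct lookup)
        self.recipes = {}    # file name -> list of (address, block_id)

    def write(self, name, data):
        # Assumes `name` is a new file; overwrites are not modeled.
        recipe = []
        for off in range(0, len(data), BLOCK_SIZE):
            chunk = data[off:off + BLOCK_SIZE]
            fp = hashlib.sha256(chunk).digest()  # assumed fingerprint function
            entry = self.fp_index.get(fp)
            if entry is None:
                # Unique block: store it and assign a new global ID.
                address = len(self.blocks)
                self.blocks.append(chunk)
                block_id = len(self.refcounts)
                self.refcounts.append(0)
                entry = (address, block_id)
                self.fp_index[fp] = entry
            # The count is reached by ID, not by fingerprint lookup.
            self.refcounts[entry[1]] += 1
            # The address goes into the recipe as well as the index.
            recipe.append(entry)
        self.recipes[name] = recipe

    def read(self, name):
        # Read path: recipe -> address -> block, no fingerprint index access.
        return b"".join(self.blocks[addr] for addr, _ in self.recipes[name])
```

In this sketch, `read` keeps working even if `fp_index` is emptied, which is the property the abstract attributes to storing block addresses in the file recipe.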

Keywords