Tehnički Vjesnik (Jan 2020)

Off-line Deduplication Method for Solid-State Disk Based on Hot and Cold Data

  • Xin Ye,
  • Zhengjun Zhai,
  • Xiaochang Li

DOI
https://doi.org/10.17559/TV-20191219154709
Journal volume & issue
Vol. 27, no. 2
pp. 368 – 373

Abstract

Solid-state disk (SSD) deduplication refers to the identification and deletion of duplicate data stored in an SSD. Deduplication improves the reliability of SSDs. At present, SSD deduplication is commonly performed online with Field Programmable Gate Array (FPGA) acceleration. The disadvantage of this approach is that the FPGA has a complex structure. This study proposes an off-line deduplication method for SSDs based on hot and cold data to simplify the structure of SSD deduplication, reduce the cost, and improve the deduplication efficiency and access performance of SSDs. First, the wear-leveling algorithm in the SSD was employed to divide the data into hot and cold. Second, a fingerprint was generated for the cold data. Third, the fingerprints were compared, and the cold data with the same fingerprint were deleted. Finally, the cold and hot data were exchanged after deduplication. Results demonstrate that the duplicate recognition rate of the proposed method is 5% - 38%, which is close to that of the online deduplication method. In terms of access performance, SSDs using the proposed method perform 20% better than traditional SSDs and approach the access performance of SSDs using online deduplication. This study provides a reference for improving the reliability of existing SSDs.
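
The first three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the wear-leveling algorithm separates hot from cold data or which fingerprint function is used, so the write-count threshold, the use of SHA-1, and all names below (`split_hot_cold`, `fingerprint`, `dedup_cold`) are assumptions made for illustration. The final step, exchanging cold and hot data, is omitted because the abstract gives no detail on it.

```python
import hashlib


def split_hot_cold(write_counts, threshold):
    """Classify pages as hot or cold.

    Assumption: a page is hot when its write count reaches the threshold.
    The abstract only says the wear-leveling algorithm makes this split.
    """
    hot = {page for page, count in write_counts.items() if count >= threshold}
    cold = set(write_counts) - hot
    return hot, cold


def fingerprint(data):
    """Fingerprint a page's contents (SHA-1 is an assumed choice)."""
    return hashlib.sha1(data).hexdigest()


def dedup_cold(pages, cold):
    """Deduplicate cold pages only.

    Returns (mapping, removed): mapping sends every cold page to the page
    that keeps its data; removed lists the pages whose copy can be deleted.
    """
    first_seen = {}  # fingerprint -> first cold page holding that content
    mapping = {}
    removed = []
    for page in sorted(cold):
        fp = fingerprint(pages[page])
        if fp in first_seen:
            mapping[page] = first_seen[fp]  # redirect to the retained copy
            removed.append(page)
        else:
            first_seen[fp] = page
            mapping[page] = page
    return mapping, removed


# Hypothetical example data: page number -> contents, page number -> writes.
pages = {0: b"A", 1: b"B", 2: b"A", 3: b"C", 4: b"B"}
write_counts = {0: 1, 1: 9, 2: 0, 3: 2, 4: 1}

hot, cold = split_hot_cold(write_counts, threshold=5)
mapping, removed = dedup_cold(pages, cold)
```

In this example page 1 is hot and is therefore never fingerprinted, so page 4 keeps its copy of `b"B"` even though page 1 holds the same contents. Restricting fingerprinting to cold data in this way is the point of the off-line approach as the abstract describes it: rarely written data can be processed outside the write path.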

Keywords