Applied Sciences (Feb 2023)

Erasure Codes for Cold Data in Distributed Storage Systems

  • Chao Yin,
  • Zhiyuan Xu,
  • Wei Li,
  • Tongfang Li,
  • Sihao Yuan,
  • Yan Liu

DOI
https://doi.org/10.3390/app13042170
Journal volume & issue
Vol. 13, no. 4
p. 2170

Abstract

Replication and erasure codes are widely used to store large amounts of data in distributed storage systems. Erasure coding makes efficient use of storage space while preserving availability and reliability, but encoding and decoding reduce system performance. Because cold data do not require high real-time availability, we focus on applying erasure codes to cold data. We propose a new erasure coding process, the NewLib code, based on the Liberation code; it specifies how data are aligned after the encoded data are striped. The NewLib code improves read and write performance in distributed storage systems. We also develop a node scheduling scheme called N-Schedule, which divides the data nodes into multiple virtual nodes according to their storage space and computing power. The virtual nodes are placed on a hash ring by consistent hashing, forming a fully symmetric and decentralized ring that provides uniform data distribution and task scheduling. Experimental results show that our scheme improves system performance.
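
The virtual-node idea behind N-Schedule can be illustrated with a minimal sketch of a weighted consistent hash ring. This is not the authors' implementation: the class name `HashRing`, the node names, the virtual-node counts, and the use of SHA-256 are all assumptions made for illustration. The only point it shows is that a node given more virtual nodes (standing in for more storage space or computing power) receives a proportionally larger share of keys.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a point on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical weighted consistent hash ring (illustration only)."""

    def __init__(self, weights: dict[str, int]):
        # weights: physical node -> number of virtual nodes.
        # The count is an assumed stand-in for a capacity score.
        self._ring = sorted(
            (ring_hash(f"{node}#{i}"), node)
            for node, count in weights.items()
            for i in range(count)
        )
        self._points = [point for point, _ in self._ring]

    def locate(self, key: str) -> str:
        """Return the physical node owning the first virtual node clockwise of key."""
        idx = bisect.bisect_right(self._points, ring_hash(key)) % len(self._ring)
        return self._ring[idx][1]


# Assumed weights: node-a has twice the capacity of the other two.
weights = {"node-a": 200, "node-b": 100, "node-c": 100}
ring = HashRing(weights)

counts: dict[str, int] = {}
for i in range(10000):
    owner = ring.locate(f"block-{i}")
    counts[owner] = counts.get(owner, 0) + 1

print(counts)
```

Because every client can rebuild the same ring from the same weight table, no central directory is needed to find a block, which is the sense in which such a ring is decentralized.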

Keywords