IEEE Access (Jan 2020)

Towards Building Reliable and Cost-Efficient Distributed Storage Systems

  • Yichuan Qi
  • Dan Feng
  • Binbing Hou

DOI
https://doi.org/10.1109/ACCESS.2020.3019108
Journal volume & issue
Vol. 8
pp. 157862 – 157877

Abstract

Reliability and cost are two important targets for distributed storage systems. Over the years, numerous schemes have been proposed to improve the reliability or reduce the cost of such systems, and they fall into three categories: (1) data redundancy schemes; (2) data placement schemes; and (3) data repair schemes. However, it remains unclear how to build a reliable and cost-efficient distributed storage system, because existing work gives insufficient consideration to (i) the combinations of different schemes and (ii) the failures and recoveries of different subsystems (racks, nodes, disks, and sectors). To measure the reliability and cost resulting from different schemes, we design and implement CR-SIM, a Comprehensive Reliability SIMulator for distributed storage systems. It accounts for various factors, such as the system topology, the data redundancy scheme, the data placement scheme, the data repair scheme, and the failure/recovery models of different subsystems. Using CR-SIM, we conduct a range of simulation-based experiments, and the results reveal several important findings that are helpful for building reliable and cost-efficient distributed storage systems. For public use, we have open-sourced our source code at https://github.com/yichuan0707/CR-SIM.

Keywords