IEEE Access (Jan 2019)
A Fast Approach to Scale Up Disk Arrays With Parity Declustered Data Layout by Minimizing Data Migration
Abstract
Parity declustering is widely deployed in erasure-coded storage systems to provide fast recovery and high data availability. However, scaling such a redundant array of inexpensive disks (RAID) must preserve the parity declustered data layout so that these properties still hold after scaling. Existing scaling algorithms fail to achieve this goal, so they cannot be applied to RAIDs with parity declustering. To address this challenge, we develop an efficient online scaling scheme called parity declustering scaling (PDS), which employs an auxiliary balanced incomplete block design to define the data migration so that the parity declustered data layout is preserved. PDS can also optionally be applied to scale RAIDs to improve reliability and/or storage efficiency by allocating more parity blocks and/or data blocks in stripes. We provide formal proofs that PDS preserves the parity declustered data layout and achieves uniform distributions of data and parity blocks after scaling while requiring only the minimal data migration. We implement PDS in Linux kernel 3.14.72 and evaluate its performance with real-world traces. The results show that, on average, PDS reduces scaling time by 82.37 percent and user response time during scaling by 18.25 percent, compared with a “moving-everything” round-robin approach adapted to achieve a parity declustered data layout after scaling.
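For readers unfamiliar with balanced incomplete block designs (BIBDs), the following is a minimal illustrative sketch of the property that a parity declustered layout relies on. It is not the PDS algorithm or its auxiliary design. It assumes a textbook (7, 3, 1) design built from the difference set {0, 1, 3} modulo 7, treats each block as the set of disks holding one stripe, and uses hypothetical helper names (`cyclic_bibd`, `is_bibd`, `disk_load`) chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def cyclic_bibd(v, base):
    """Build v blocks by shifting a base block modulo v (illustrative construction)."""
    return [tuple(sorted((x + shift) % v for x in base)) for shift in range(v)]


def is_bibd(v, blocks):
    """Return True if all blocks have the same size and every pair of the v
    disks appears together in the same nonzero number of blocks."""
    k = len(blocks[0])
    if any(len(set(b)) != k for b in blocks):
        return False
    counts = Counter()
    for b in blocks:
        for pair in combinations(sorted(b), 2):
            counts[pair] += 1
    lambdas = {counts.get(pair, 0) for pair in combinations(range(v), 2)}
    return len(lambdas) == 1 and 0 not in lambdas


def disk_load(blocks):
    """Count how many stripes (blocks) place a unit on each disk."""
    load = Counter()
    for b in blocks:
        load.update(b)
    return load


# Assumed example: 7 disks, stripes of width 3, base block {0, 1, 3}.
blocks = cyclic_bibd(7, (0, 1, 3))
print(blocks)
print("BIBD:", is_bibd(7, blocks))
print("stripes per disk:", dict(disk_load(blocks)))
```

Because every pair of disks shares the same number of stripes, the reconstruction reads after a single disk failure are spread evenly over all surviving disks. A scaling scheme for such arrays has to keep this pairwise balance intact after disks are added.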
Keywords