IEEE Access (Jan 2018)
Block Placement in Distributed File Systems Based on Block Access Frequency
Abstract
This paper proposes a new data placement policy that allocates data blocks across the storage servers of distributed/parallel file systems so that the block access workload is evenly distributed. To this end, we first analyze the block access history of a specific application and then introduce a k-partition algorithm that divides the data blocks into multiple groups according to their access frequency. Each group then carries almost the same access workload, so the groups can be distributed onto the storage servers of the distributed file system to balance block accesses when the application runs. In summary, the proposed data placement policy yields not only an even data distribution but also balanced block access. Experimental results show that, compared with the commonly used round-robin block placement policy, the proposed scheme greatly reduces I/O time and improves the utilization of storage servers when running database-relevant applications.
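The idea of frequency-based grouping can be illustrated with a minimal sketch. The abstract does not specify the k-partition algorithm, so the code below substitutes a simple greedy heuristic (hottest block first, into the least-loaded group that still has room). The function names `partition_blocks` and `round_robin_loads`, the integer block IDs, and the modelling of round-robin as `block_id % k` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def partition_blocks(access_trace, k):
    """Greedy stand-in for a k-partition: split blocks into k groups of
    near-equal size whose total access counts are as close as this
    heuristic can make them. Not the paper's algorithm."""
    freq = Counter(access_trace)          # block id -> access count
    capacity = -(-len(freq) // k)         # ceil(#blocks / k): even data distribution
    groups = [[] for _ in range(k)]
    loads = [0] * k
    # Visit blocks from hottest to coldest; break ties by block id.
    for block, count in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0])):
        # Among groups that still have room, pick the least-loaded one.
        open_groups = [g for g in range(k) if len(groups[g]) < capacity]
        target = min(open_groups, key=lambda g: (loads[g], g))
        groups[target].append(block)
        loads[target] += count
    return groups, loads


def round_robin_loads(access_trace, k):
    """Per-server access load when block b is stored on server b % k
    (an assumed model of round-robin placement)."""
    loads = [0] * k
    for block, count in Counter(access_trace).items():
        loads[block % k] += count
    return loads


if __name__ == "__main__":
    # Hypothetical trace: block 0 is read 6 times, block 1 once, and so on.
    counts = {0: 6, 1: 1, 2: 2, 3: 5, 4: 3, 5: 4}
    trace = [b for b, c in counts.items() for _ in range(c)]
    groups, loads = partition_blocks(trace, k=3)
    print(groups, loads)                  # [[0, 1], [3, 2], [5, 4]] [7, 7, 7]
    print(round_robin_loads(trace, k=3))  # [11, 4, 6]
```

On this made-up trace, each group holds two blocks and receives seven accesses, whereas the round-robin model sends eleven accesses to one server and four to another. The capacity limit is one possible way to keep the data distribution even; a skewed trace may force a less balanced access load than in this example.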
Keywords