IEEE Access (Jan 2020)
On the Complexity and Performance of the Information Dispersal Algorithm
Abstract
The Information Dispersal Algorithm (IDA) has become a key component of several fault-tolerant massive storage systems. From a theoretical point of view, it is a linear transformation over a finite field applied to the vectors that make up a given file. The direct transformation adds redundancy, splitting the initial file into a new set of files called dispersals. The inverse transformation recovers the original file from a subset of the dispersals. This work demonstrates the impact of input and output (I/O) operations on the direct and inverse transformations. Different alternatives were evaluated for controlling the exchange of elements between RAM and disk, which is the key operation in building a vector in memory and storing its entries in a file. First, the impact of the working finite field was tested; second, the impact of using a buffer for the exchange between RAM and the hard disk; and finally, several instances of the algorithm were deployed simultaneously to evaluate the impact of parallelism. The results demonstrate that the combination of these factors may have an important effect on the speed of both the direct and inverse procedures.
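As an illustration of the direct and inverse transformations described above, the following is a minimal in-memory sketch, not the implementation evaluated in this work. It assumes the prime field GF(257), a Vandermonde matrix as the dispersal matrix, and hypothetical function names (`disperse`, `recover`); with n dispersals, any m of them suffice to rebuild the file. Note that symbols in GF(257) can take the value 256, so a real implementation would need more than one byte per stored symbol or a field such as GF(2^8).

```python
P = 257  # assumption: prime modulus, so every byte value 0..255 is a field element


def vandermonde(n, m):
    # Row i is [1, x, x^2, ...] with x = i + 1; any m rows with distinct x
    # form an invertible matrix modulo P (requires n <= 256).
    return [[pow(i + 1, j, P) for j in range(m)] for i in range(n)]


def disperse(data, n, m):
    # Direct transformation: split the file into vectors of length m and
    # multiply each one by the n x m matrix. Returns n lists of symbols.
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    A = vandermonde(n, m)
    vectors = [padded[k:k + m] for k in range(0, len(padded), m)]
    return [[sum(a * b for a, b in zip(row, v)) % P for v in vectors]
            for row in A]


def invert(M):
    # Gauss-Jordan elimination over GF(P) on the augmented matrix [M | I].
    m = len(M)
    aug = [list(row) + [int(i == j) for j in range(m)]
           for i, row in enumerate(M)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], P - 2, P)  # modular inverse via Fermat
        aug[col] = [x * inv % P for x in aug[col]]
        for r in range(m):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]


def recover(pieces, n, m, length):
    # Inverse transformation. `pieces` maps a dispersal index to its list of
    # symbols; any m entries are enough. `length` is the original file size.
    idx = sorted(pieces)[:m]
    A = vandermonde(n, m)
    inv = invert([A[i] for i in idx])
    out = bytearray()
    for col in zip(*(pieces[i] for i in idx)):
        out.extend(sum(a * b for a, b in zip(row, col)) % P for row in inv)
    return bytes(out[:length])
```

For example, with n = 5 and m = 3, calling `disperse(data, 5, 3)` yields five dispersals, and `recover` can be given any three of them, keyed by their index, together with the original length. The sketch keeps everything in memory; the file I/O and buffering between RAM and disk studied in this work are deliberately left out.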
Keywords