Journal of Cloud Computing: Advances, Systems and Applications (Feb 2019)
Adaptive deduplication of virtual machine images using AKKA stream to accelerate live migration process in cloud environment
Abstract
Cloud computing is a paradigm that provides resources to users on demand from a shared pool to satisfy their requirements. In this process, many servers in the cloud environment become overloaded or underloaded, so power consumption and load balancing emerge as major problems; both are addressed by live virtual machine (VM) migration. Load balancing is achieved by moving virtual machines from an overloaded host to an underloaded host, and from an underloaded host to any other host that is not overloaded; this process is called VM migration. If it is carried out without powering off the virtual machines, it is called live VM migration, and it also mitigates the power consumed by physical hosts. Migrating a virtual machine involves virtualized components such as storage disks, memory, CPU, and networking; the entire state of the VM is captured as a collection of data files, namely virtual disk files, configuration files, and log files. The virtual disk files occupy the most space and therefore dominate migration time and downtime. These disk files contain many zero pages as well as similar and redundant pages. Techniques such as compression and deduplication are used to reduce the size of the VM disk image file. Compression is not widely used because of its limited compression ratio and the time required for decompression, so many researchers rely on deduplication to reduce the VM disk image file during live migration. The contribution of this work is an adaptive deduplication mechanism that reduces the VM disk image file size by performing both fixed-length and variable-length block-level deduplication; the Rabin-Karp rolling hash algorithm is used for variable-length block-level deduplication. Akka Streams is used to stream the VM disk image files, since live migration transfers a bulk volume of data. To shorten the deduplication process, many researchers have used multithreading and multi-core technologies; we use multithreading within the Akka framework to run the deduplication process concurrently without OutOfMemory errors. Experimental results show that the adaptive deduplication method achieves up to an 83% overall reduction in image storage space, an 89.76% reduction in total migration time, and a 3% improvement in deduplication rate compared with an existing image management system. These results are significant because applying the method to data centre storage yields substantial space savings; the actual reduction depends on the dataset used and the applications running on the VM.
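To make the variable-length deduplication step concrete, the sketch below shows one way a Rabin-Karp rolling hash can drive content-defined chunking of a disk image. It is an illustrative example only: the window size, hash base, modulus, boundary mask, and chunk-size limits are assumed values rather than parameters taken from the paper, and the implementation language (Scala, chosen to match the Akka ecosystem) is likewise an assumption.

```scala
object RollingHashChunker {
  // Illustrative parameters (assumptions, not values from the paper)
  val Window   = 48                // bytes in the rolling-hash window
  val Base     = 257L              // polynomial base
  val Mod      = 1000000007L       // large prime modulus
  val Mask     = (1 << 13) - 1     // boundary mask, average chunk ~8 KiB
  val MinChunk = 2 * 1024          // lower bound on chunk size
  val MaxChunk = 64 * 1024         // upper bound on chunk size

  // Precomputed Base^(Window-1) mod Mod, used to drop the oldest byte
  private val basePow: Long =
    (1 until Window).foldLeft(1L)((acc, _) => acc * Base % Mod)

  /** Split `data` into variable-length chunks; a chunk boundary is cut
    * where the Rabin-Karp rolling hash matches the mask. */
  def chunk(data: Array[Byte]): Vector[Array[Byte]] = {
    val chunks = Vector.newBuilder[Array[Byte]]
    var start = 0
    var hash  = 0L
    var i     = 0
    while (i < data.length) {
      val in = data(i) & 0xFF
      if (i - start >= Window) {
        val out = data(i - Window) & 0xFF                 // byte leaving the window
        hash = (hash - out * basePow % Mod + Mod) % Mod
      }
      hash = (hash * Base + in) % Mod                     // byte entering the window
      val len = i - start + 1
      val boundary = (hash & Mask) == Mask && len >= MinChunk
      if (boundary || len >= MaxChunk) {                  // cut a chunk here
        chunks += data.slice(start, i + 1)
        start = i + 1
        hash = 0L
      }
      i += 1
    }
    if (start < data.length) chunks += data.slice(start, data.length)
    chunks.result()
  }
}
```

Because the boundaries depend only on local content, two disk images that differ in a few blocks still produce mostly identical chunks; duplicate chunks can then be detected by fingerprinting each chunk (for example with a cryptographic hash) and comparing digests against an index of previously stored chunks.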
Keywords