Applied Sciences (Nov 2023)
A Unified Vendor-Agnostic Solution for Big Data Stream Processing in a Multi-Cloud Environment
Abstract
The field of cloud computing has witnessed tremendous progress, with commercial cloud providers offering powerful distributed infrastructures to small and medium enterprises (SMEs) through their revolutionary pay-as-you-go model. Simultaneously, the rise of containers has empowered virtualisation, providing orchestration technologies for the deployment and management of large-scale distributed systems across different geolocations and providers. Big data is another research area which has developed at an extraordinary pace as industries endeavour to discover innovative and effective ways of processing large volumes of structured, semi-structured, and unstructured data. This research aims to integrate the latest advances within the fields of cloud computing, virtualisation, and big data for a systematic approach to stream processing. The novel contributions of this research are: (1) MC-BDP, a reference architecture for big data stream processing in a containerised, multi-cloud environment; (2) a case study conducted with the Estates and Sustainability departments at Leeds Beckett University to evaluate an MC-BDP prototype within the context of energy efficiency for smart buildings. The study found that MC-BDP is scalable and fault-tolerant across cloud environments, key attributes for SMEs managing resources under budgetary constraints. Additionally, our experiments on technology agnosticism and container co-location provide new insights into resource utilisation, cost implications, and optimal deployment strategies in cloud-based big data streaming, offering valuable guidelines for practitioners in the field.
Keywords