IoT (Nov 2023)
Constraint-Aware Federated Scheduling for Data Center Workloads
Abstract
The use of data centers is ubiquitous, as they support multiple technologies across domains for storing, processing, and disseminating data. IoT applications utilize both cloud data centers and edge data centers based on the nature of the workload. Due to the stringent latency requirements of IoT applications, the workloads are run on hardware accelerators such as FPGAs and GPUs for faster execution. The introduction of such hardware alongside existing variations in the hardware and software configurations of the machines in the data center, increases the heterogeneity of the infrastructure. Optimal job performance necessitates the satisfaction of task placement constraints. This is accomplished through constraint-aware scheduling, where tasks are scheduled on worker nodes with appropriate machine configurations. The presence of placement constraints limits the number of suitable resources available to run a task, leading to queuing delays. As federated schedulers have gained prominence for their speed and scalability, we assess the performance of two such schedulers, Megha and Pigeon, within a constraint-aware context. We extend our previous work on Megha by comparing its performance with a constraint-aware version of the state-of-the-art federated scheduler Pigeon, PigeonC. The results of our experiments with synthetic and real-world cluster traces show that Megha reduces the 99th percentile of job response time delays by a factor of 10 when compared to PigeonC. We also describe enhancements made to Megha’s architecture to improve its scheduling efficiency.
Keywords