IEEE Access (Jan 2019)
Application-Oriented Network Scheduling With Metaflow
Abstract
Distributed applications usually feature a set of correlated flows between two consecutive computation stages. The scheduling of these flows has a crucial influence on job completion time. The coflow abstraction improves performance by optimizing the finish time of the entire set of flows. However, the flows and computing tasks in one application have relationships more complex than coflow's barrier assumption allows. In such cases, scheduling via the coflow abstraction may hurt application performance. Accordingly, we propose metaflow, a traffic abstraction derived from the computation graph of the application. Metaflow reveals the detailed flow requirements of the application and makes it easier to reduce the job completion time. Based on metaflow, we first develop a mathematical model and formulate the scheduling problem as an integer linear programming (ILP) problem. To solve this ILP problem efficiently, we further prove through rigorous theoretical analysis that it has an equivalent linear programming (LP) problem. To demonstrate the effectiveness of scheduling with metaflow, we have conducted extensive simulations with both synthetic single jobs and production traces containing multiple jobs. The simulation results verify that our new scheduler adapts well to different jobs and achieves an average speedup of 2.87× on a real-life workload compared to the state-of-the-art coflow scheduler.
Keywords