IEEE Access (Jan 2023)
Accelerating Parallel Applications Based on Graph Reordering for Random Network Topologies
Abstract
The Message Passing Interface (MPI) is a crucial programming tool for communication between processes in parallel applications. MPI users aim to allocate tasks to processors in a way that maximizes both spatial and temporal locality in the network. This is challenging, especially in large-scale networks, where maximizing processor locality may not be feasible at runtime. To address this issue, we propose Hamorder, an offline node reassignment approach based on graph reordering that takes physical processor locations into account for Random network topologies. Hamorder aims to optimize task mapping for improved performance in parallel applications, whether across multiple tasks or within a single task. Additionally, we investigate the potential of improving MPI applications through runtime parameter tuning based on Hamorder. Our evaluation results show that, on Random topologies, Hamorder provides a 27.3% performance improvement over Gorder, a state-of-the-art algorithm that enhances cache locality by rearranging the vertices of a graph so that vertices typically accessed together are placed in close proximity. Moreover, our autotuning framework using Hamorder achieves an average speedup of $1.38\times$ for targeted MPI applications by searching through various runtime parameter combinations.
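To make the task-mapping objective concrete, the following minimal sketch scores a task-to-node mapping by traffic volume times hop distance on a small irregular topology. It is not Hamorder or Gorder: the topology `adj`, the traffic matrix `traffic`, and the helper names `hop_distances`, `mapping_cost`, and `best_mapping` are all hypothetical, and the exhaustive search stands in for a real reordering heuristic only because the example is tiny.

```python
from collections import deque
from itertools import permutations


def hop_distances(adj, src):
    """BFS hop counts from src over an undirected topology (adjacency lists)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mapping_cost(adj, traffic, mapping):
    """Sum over communicating task pairs of (traffic volume x hop distance)."""
    dist = {node: hop_distances(adj, node) for node in adj}
    return sum(vol * dist[mapping[a]][mapping[b]]
               for (a, b), vol in traffic.items())


def best_mapping(adj, traffic):
    """Exhaustive search over all placements; illustrative only, not scalable."""
    nodes = sorted(adj)
    tasks = sorted({t for pair in traffic for t in pair})
    candidates = (dict(zip(tasks, perm))
                  for perm in permutations(nodes, len(tasks)))
    return min(candidates, key=lambda m: mapping_cost(adj, traffic, m))


# Hypothetical topology: four nodes connected in a line, 0-1-2-3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# Hypothetical traffic: tasks 0 and 3 exchange far more data than tasks 1 and 2.
traffic = {(0, 3): 10, (1, 2): 1}

identity = {task: task for task in adj}   # task i runs on node i
identity_cost = mapping_cost(adj, traffic, identity)
reordered = best_mapping(adj, traffic)
reordered_cost = mapping_cost(adj, traffic, reordered)
print(identity_cost, reordered_cost)
```

In this toy case the identity mapping places the heavily communicating tasks 0 and 3 three hops apart, whereas the reordered mapping places them on adjacent nodes. A practical approach would replace the exhaustive search with a heuristic, since the number of placements grows factorially with the number of nodes.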
Keywords