PLoS ONE (Jan 2020)
Binary-level data dependence analysis of hot execution regions using abstract interpretation at runtime.
Abstract
With the widespread of multicore systems, automatic parallelization becomes more pronounced, particularly for legacy programs, where the source code is not generally available. An essential operation in any parallelization system is detecting data dependence among parallelization candidate instructions. Conducting dependence analysis at the binary-level is more challenging than that at the source-level due to the much lower semantics of the binary code. In this paper, we consider using the elaborate 'static' analysis of abstract interpretation, for the first time, at runtime for data dependence detection. Specifically, our system interprets instructions at a hot region, while at the same time, collect programs semantics for seen program points, thereby conducting abstract interpretation analysis dynamically. The analysis is guaranteed to be correct as long as execution does not exit the region prematurely. Moreover, successive hot region re-entries will resume previous analysis, albeit much faster in case no major change in the program semantics. Such approach provides for more rigorous analysis than other simple dynamic analysis which would typically miss parallelization opportunities. The proposed approach also does not require any hardware support, availability of the source code, as well as any code re-compilation. To study the performance and accuracy of our approach, we have extended the Padrone dynamic code modification framework, and conduct an initial study on a set of PolyBench kernels and selected programs from SPEC CPU. Experimental results show accurate dependence detection with low overhead.