Jisuanji kexue (Jan 2023)
Fast Computation Graph Simplification via Influence-based Pruning for Graph Neural Network
Abstract
Computation graph simplification is a kind of optimization technique to improve the training speed of graph neural network models.It uses the characteristics of common neighbors among nodes and speeds up the training of graph neural network models by eliminating redundant computation in the stage of aggregation.However,when dealing with large-scale graph data,the existing computation graph simplification techniques suffer from the problem of low computation efficiency,which affects the application of computation graph simplification in large-scale graph neural networks.This paper analyzes the current techniques of computation graph simplification in detail by counting the overhead of two phases including searching and reconstruction,and summarizes the shortcomings of existing techniques.On this basis,it proposes an algorithm of fast computation graph simplification via influence-based pruning for graph neural network.This algorithm applies an influence model to describe the contribution of each node to the computation graph simplification and prunes the searching space of common neighbors based on influence,which greatly improves the efficiency of the phase of searching.In addition,this paper analyzes the algorithm complexity and theoretically proves the expected acceleration effect of this technique.Finally,in order to verify the effectiveness of this novel algorithm,the algorithm is applied to two mainstream computation graph simplification technique,and common graph neural network models areselected to test on some data sets.Experimental results demonstrate that the novel algorithm can significantly improve the efficiency of the computation graph simplification on the premise of ensuring a certain amount of redundant computation reduction.Compared with the baseline of computation graph simplification,the proposed technique can speed up to 3.4 times in searching phase and speed up to 1.6 times on the whole process on PPI dataset,while it can speed up to 5.1 times in searching phase and speed up to 3.2 times on the whole process on Reddit dataset.
Keywords