Data Science and Engineering (Sep 2023)
Efficient Network Representation Learning via Cluster Similarity
Abstract
Network representation learning is a de facto tool for graph analytics. Most previous approaches factorize a proximity matrix between nodes. However, if $$n$$ is the number of nodes, the proximity matrix has size $$n \times n$$, so network representation learning needs $$O(n^3)$$ time and $$O(n^2)$$ space; both costs are prohibitive for large-scale graphs. This paper introduces the novel idea of using similarities between clusters instead of proximities between nodes: the proposed approach computes the representations of the clusters from the similarities between clusters, and then computes the representations of the nodes by referring to the cluster representations. If $$l$$ is the number of clusters, then since $$l \ll n$$, we can efficiently obtain the representations of the clusters from a small $$l \times l$$ similarity matrix. Furthermore, since the nodes in each cluster share similar structural properties, we can effectively compute the representation vectors of the nodes. Experiments show that our approach performs network representation learning more efficiently and effectively than existing approaches.
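The two-stage idea described above can be illustrated with a minimal sketch. It is not the method of the paper: the similarity measure (size-normalized edge weight between clusters), the use of a truncated SVD, the rule that places each node at the edge-weighted average of the cluster vectors, and the function names `cluster_embeddings` and `node_embeddings` are all assumptions made for illustration. The cluster labels are assumed to be given, with every label from 0 to l-1 used at least once.

```python
import numpy as np


def cluster_embeddings(adj, labels, dim):
    """Embed clusters by factorizing a small l x l similarity matrix.

    adj    : (n, n) symmetric adjacency matrix (dense here for brevity)
    labels : (n,) integer cluster id per node, covering 0..l-1
    dim    : embedding dimension, dim <= l
    """
    n = adj.shape[0]
    l = int(labels.max()) + 1
    # One-hot membership matrix of shape (n, l).
    member = np.zeros((n, l))
    member[np.arange(n), labels] = 1.0
    # Assumed similarity: total edge weight between two clusters,
    # normalized by the product of their sizes.
    sizes = member.sum(axis=0)
    sim = member.T @ adj @ member / np.outer(sizes, sizes)
    # The factorization touches only the l x l matrix, not an n x n one.
    u, s, _ = np.linalg.svd(sim)
    return u[:, :dim] * np.sqrt(s[:dim]), member


def node_embeddings(adj, member, cluster_emb):
    """Place each node at the edge-weighted average of cluster vectors."""
    weights = adj @ member                       # (n, l) edge weight to each cluster
    row_sums = weights.sum(axis=1, keepdims=True)
    # Isolated nodes fall back to the vector of their own cluster.
    weights = np.where(row_sums > 0,
                       weights / np.maximum(row_sums, 1e-12),
                       member)
    return weights @ cluster_emb


# Toy usage: two triangles joined by a single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
labels = np.array([0, 0, 0, 1, 1, 1])

c_emb, member = cluster_embeddings(adj, labels, dim=2)
z = node_embeddings(adj, member, c_emb)
print(z.shape)
```

In this sketch the only factorization is of the l x l matrix, which reflects the efficiency argument of the abstract. With a sparse adjacency matrix, the products with the membership matrix would scale with the number of edges rather than with n squared.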
Keywords