Efficient Network Representation Learning via Cluster Similarity

Yasuhiro Fujiwara; Yasutoshi Ida; Atsutoshi Kumagai; Masahiro Nakano; Akisato Kimura; Naonori Ueda

doi:10.1007/s41019-023-00222-x

Data Science and Engineering (Sep 2023)

Affiliations

Yasuhiro Fujiwara: NTT Communication Science Laboratories
Yasutoshi Ida: NTT Communication Science Laboratories
Atsutoshi Kumagai: NTT Communication Science Laboratories
Masahiro Nakano: NTT Communication Science Laboratories
Akisato Kimura: NTT Communication Science Laboratories
Naonori Ueda: NTT Communication Science Laboratories

DOI: https://doi.org/10.1007/s41019-023-00222-x
Journal volume & issue: Vol. 8, no. 3
pp. 279 – 291

Abstract

Network representation learning is a de facto tool for graph analytics. Most previous approaches factorize the proximity matrix between nodes. However, if $n$ is the number of nodes, the proximity matrix has size $n \times n$, so network representation learning needs $O(n^3)$ time and $O(n^2)$ space; both are prohibitively high for large-scale graphs. This paper introduces the novel idea of using similarities between clusters instead of proximities between nodes: the proposed approach computes the representations of the clusters from the similarities between clusters, and then computes the representations of nodes by referring to them. If $l$ is the number of clusters, then since $l \ll n$, the representations of clusters can be obtained efficiently from a small $l \times l$ similarity matrix. Furthermore, since nodes in each cluster share similar structural properties, the representation vectors of nodes can be computed effectively. Experiments show that the approach performs network representation learning more efficiently and effectively than existing approaches.
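
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify how cluster similarity is defined or how node vectors are derived from cluster vectors, so the code below assumes (a) a given clustering, (b) similarity equal to the total edge weight between two clusters, and (c) node vectors formed as the edge-weighted average of the vectors of the clusters a node links to. The function name `cluster_similarity_embedding` and all parameters are hypothetical.

```python
import numpy as np

def cluster_similarity_embedding(adj, labels, dim):
    """Hypothetical sketch: embed clusters from an l x l matrix, then nodes."""
    adj = np.asarray(adj, dtype=float)   # symmetric adjacency matrix, shape (n, n)
    labels = np.asarray(labels)          # cluster id in 0..l-1 for each node
    n = adj.shape[0]
    l = int(labels.max()) + 1

    # One-hot cluster membership, shape (n, l).
    member = np.zeros((n, l))
    member[np.arange(n), labels] = 1.0

    # Assumed cluster similarity: total edge weight between each pair of clusters.
    sim = member.T @ adj @ member        # shape (l, l)

    # Factorize the small symmetric matrix instead of an n x n proximity matrix.
    eigvals, eigvecs = np.linalg.eigh(sim)
    top = np.argsort(-np.abs(eigvals))[:dim]
    cluster_emb = eigvecs[:, top] * np.sqrt(np.abs(eigvals[top]))  # (l, dim)

    # Assumed node rule: average the vectors of the clusters a node links to.
    link = adj @ member                  # edge weight from each node to each cluster
    deg = link.sum(axis=1, keepdims=True)
    # Isolated nodes (deg == 0) fall back to their own cluster's vector.
    weights = np.divide(link, deg, out=member.copy(), where=deg > 0)
    node_emb = weights @ cluster_emb     # shape (n, dim)
    return sim, cluster_emb, node_emb

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0
labels = [0, 0, 0, 1, 1, 1]

sim, cluster_emb, node_emb = cluster_similarity_embedding(adj, labels, dim=2)
```

In this sketch the only factorization is applied to the $l \times l$ matrix `sim`, so its cost depends on $l$ rather than $n$; the node step is a matrix product. Nodes 0 and 1 link only to their own cluster and therefore receive that cluster's vector, while node 2 is pulled slightly toward the other cluster by its single bridging edge.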

Published in Data Science and Engineering

ISSN: 2364-1185 (Print); 2364-1541 (Online)
Publisher: SpringerOpen
Country of publisher: Germany
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.springer.com/41019
