Symmetry (Mar 2022)

Self-Supervised Graph Representation Learning via Information Bottleneck

  • Junhua Gu,
  • Zichen Zheng,
  • Wenmiao Zhou,
  • Yajuan Zhang,
  • Zhengjun Lu,
  • Liang Yang

DOI
https://doi.org/10.3390/sym14040657
Journal volume & issue
Vol. 14, no. 4
p. 657

Abstract

Graph representation learning has become a mainstream method for processing network-structured data, yet most graph representation learning methods rely heavily on label information for downstream tasks. Since labeled data are scarce in the real world, training graph neural networks in a self-supervised manner is a significant challenge. Existing graph neural network approaches attempt to maximize mutual information for self-supervised learning, which leaves a large amount of redundant information in the graph representation and thus degrades the performance of downstream tasks. Therefore, the self-supervised graph information bottleneck (SGIB) proposed in this paper exploits the symmetry and asymmetry of graphs to establish contrastive learning and introduces information bottleneck theory as the training loss. The model extracts the features common to both views and the features specific to each view by maximizing the estimated mutual information between the local high-level representation of one view and the global summary vector of the other view. It also removes redundant information irrelevant to the target task by minimizing the mutual information between the local high-level representations of the two views. Extensive experiments on three public datasets and two large-scale datasets show that SGIB learns higher-quality node representations and, in an unsupervised setting, improves over existing models on several classical network analysis tasks such as node classification and node clustering. In addition, a deep-network experiment is designed for further analysis, and the results show that SGIB can also alleviate the over-smoothing problem to a certain extent. We can therefore infer from these network analysis experiments that introducing information bottleneck theory to remove redundant information is an effective way to improve the performance of downstream tasks.
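
To make the objective described in the abstract concrete, below is a minimal sketch (in PyTorch) of an SGIB-style loss: cross-view local-global mutual information is maximized with a DGI/MVGRL-style bilinear discriminator and shuffled-node negatives, while mutual information between the two views' local representations is penalized as redundancy. All names, the readout, and the specific MI estimators are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch of an information-bottleneck-style contrastive objective for two
# graph views. Assumes node embeddings h1, h2 (shape N x d) have already been
# produced by a shared GNN encoder applied to two views of the same graph.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BilinearDiscriminator(nn.Module):
    """Scores agreement between node embeddings and a graph-level summary."""

    def __init__(self, dim: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(dim, dim))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, nodes: torch.Tensor, summary: torch.Tensor) -> torch.Tensor:
        # nodes: (N, d), summary: (d,) -> per-node logits of shape (N,)
        return nodes @ self.weight @ summary


def local_global_mi(disc, nodes, summary):
    """Jensen-Shannon-style MI lower bound: positives are real node embeddings,
    negatives are row-shuffled (corrupted) ones; higher value ~ higher MI."""
    neg_nodes = nodes[torch.randperm(nodes.size(0))]
    pos = disc(nodes, summary)
    neg = disc(neg_nodes, summary)
    logits = torch.cat([pos, neg])
    labels = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)])
    return -F.binary_cross_entropy_with_logits(logits, labels)


def local_local_mi(h1, h2, temperature: float = 0.5):
    """InfoNCE-style estimate of MI between the two views' local embeddings;
    SGIB-style training treats this as a quantity to *minimize*."""
    z1, z2 = F.normalize(h1, dim=1), F.normalize(h2, dim=1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return -F.cross_entropy(logits, targets)  # higher value ~ higher MI


def sgib_style_loss(h1, h2, disc, beta: float = 0.1):
    """Maximize I(h1; s2) + I(h2; s1), penalize beta * I(h1; h2)."""
    s1 = torch.sigmoid(h1.mean(dim=0))  # global summary vector of view 1 (mean readout)
    s2 = torch.sigmoid(h2.mean(dim=0))  # global summary vector of view 2
    mi_cross = local_global_mi(disc, h1, s2) + local_global_mi(disc, h2, s1)
    mi_redundant = local_local_mi(h1, h2)
    return -mi_cross + beta * mi_redundant
```

In practice the two views might come from the original adjacency and an augmented or diffused version of it, encoded by the same GNN; the trade-off coefficient beta and the choice of MI estimators are hyperparameters of this sketch rather than values taken from the paper.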

Keywords