Applied Network Science (Oct 2021)

The interplay between communities and homophily in semi-supervised classification using graph neural networks

  • Hussain Hussain,
  • Tomislav Duricic,
  • Elisabeth Lex,
  • Denis Helic,
  • Roman Kern

DOI
https://doi.org/10.1007/s41109-021-00423-1
Journal volume & issue
Vol. 6, no. 1
pp. 1 – 26

Abstract

Graph Neural Networks (GNNs) are effective in many applications. Still, there is limited understanding of how common graph structures affect the learning process of GNNs. To fill this gap, we study the impact of community structure and homophily on the performance of GNNs in semi-supervised node classification on graphs. Our methodology consists of systematically manipulating the structure of eight datasets and measuring both the performance of GNNs on the original graphs and the change in performance in the presence and absence of community structure and/or homophily. Our results show the major impact of both homophily and communities on the classification accuracy of GNNs and provide insights into their interplay. In particular, by analyzing community structure and its correlation with node labels, we are able to make informed predictions about the suitability of GNNs for classification on a given graph. Using an information-theoretic metric for community-label correlation, we devise a guideline for model selection based on graph structure. With our work, we provide insights into the abilities of GNNs and the impact of common network phenomena on their performance. Our work improves model selection for node classification in semi-supervised settings.
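
To make the two structural quantities in the abstract concrete, the following minimal Python sketch computes an edge homophily ratio and one possible information-theoretic measure of community-label correlation on a toy graph. The abstract does not name the metric used, so the uncertainty coefficient U(L|C) = I(L;C) / H(L) is an assumption chosen for illustration. The function names, the toy graph, and the hand-assigned community memberships are likewise hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def uncertainty_coefficient(labels, communities):
    """U(L|C) = I(L;C) / H(L): share of label entropy explained by communities.

    Assumed metric for illustration; returns a value in [0, 1].
    """
    h_labels = entropy(labels)
    if h_labels == 0:
        # A single label class carries no uncertainty to explain.
        return 1.0
    h_comms = entropy(communities)
    h_joint = entropy(list(zip(labels, communities)))
    mutual_information = h_labels + h_comms - h_joint
    return mutual_information / h_labels


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = [0, 0, 0, 1, 1, 1]        # node labels, indexed by node id
communities = [0, 0, 0, 1, 1, 1]   # community ids, assigned by hand here

print(edge_homophily(edges, labels))
print(uncertainty_coefficient(labels, communities))
```

In this toy graph every edge except the bridge connects same-label nodes, and the communities coincide with the labels, so both quantities are high. In practice the community assignment would come from a community detection algorithm rather than being set by hand.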

Keywords