IEEE Access (Jan 2021)

Large-Scale Network Community Detection Using Similarity-Guided Merge and Refinement

  • Volkan Tunali

DOI
https://doi.org/10.1109/ACCESS.2021.3083971
Journal volume & issue
Vol. 9
pp. 78538 – 78552

Abstract

It is possible to extract valuable insights about the functional properties of a system by identifying and inspecting the community structure in the network that models the system. Community detection aims to extract these community structures from networks. Many community detection methods have been proposed that consider the problem from different perspectives. However, with the emergence of very large and complex networks from a variety of domains, there has been a growing need for community detection methods that can operate at scale effectively and efficiently. Considering this, we propose a novel algorithm for large-scale community detection, based on two novel similarity indices that we also propose. In the first stage of our proposed algorithm, we rapidly generate candidate communities using a mechanism similar to information propagation. Then, we merge small candidates that have fewer nodes than a calculated threshold with the larger ones using the similarity between nodes and communities. Next, we apply a refinement operation to the candidates by moving each node to the candidate to which it is most similar, again using the same similarity index. After that, we merge small communities with larger ones using the similarity between communities until no gain in modularity is obtained. Finally, in the last stage, we apply the same refinement operation as in the third stage. Through extensive experimentation on real-world and artificially generated benchmark networks, we demonstrate and verify the performance and effectiveness of the proposed algorithm by comparing it with state-of-the-art methods. Experimental results indicate that our algorithm scales very well with the growing size and complexity of networks. In addition, our algorithm outperforms most state-of-the-art community detection methods in both detection performance and computation time.
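
The fourth stage stops merging once modularity no longer improves. The abstract does not define the two similarity indices, so the sketch below does not attempt to reproduce them; it only illustrates the stopping test, assuming the standard Newman-Girvan modularity on an undirected, unweighted graph without self-loops. The function names `modularity` and `merge_gain` and the toy graph are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition (assumed definition).

    edges: list of (u, v) pairs, undirected, no self-loops or duplicates.
    community_of: dict mapping each node to a community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # sum of node degrees in the community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def merge_gain(edges, community_of, a, b):
    """Change in Q if community b is merged into community a."""
    merged = {n: (a if c == b else c) for n, c in community_of.items()}
    return modularity(edges, merged) - modularity(edges, community_of)


# Toy graph: two triangles joined by one bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
# Node 0 starts as a small candidate community "X" next to "Y" and "B".
partition = {0: "X", 1: "Y", 2: "Y", 3: "B", 4: "B", 5: "B"}

gain_xy = merge_gain(edges, partition, "Y", "X")  # positive: merge is accepted
partition = {n: ("Y" if c == "X" else c) for n, c in partition.items()}
gain_yb = merge_gain(edges, partition, "Y", "B")  # negative: merging stops
```

In this toy case, merging the singleton into its own triangle raises modularity, while merging the two triangles lowers it, so a loop driven by this test would stop with two communities. Recomputing Q from scratch for every candidate merge is used here only for clarity; an implementation intended for large networks would presumably update the gain incrementally.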

Keywords