PeerJ Computer Science (Aug 2023)

Identifying bias in network clustering quality metrics

  • Martí Renedo-Mirambell,
  • Argimiro Arratia

DOI: https://doi.org/10.7717/peerj-cs.1523
Journal volume & issue: Vol. 9, p. e1523

Abstract


We study potential biases of popular network clustering quality metrics, such as those based on the dichotomy between internal and external connectivity. We propose a method that uses both stochastic block models and preferential attachment block models to generate networks with preset community structures and Poisson or scale-free degree distributions, to which the quality metrics are then applied. These models also allow us to generate multi-level structures of varying strength, which reveal whether a metric favours partitions into a larger or a smaller number of clusters. Additionally, we propose another quality metric, the density ratio. We observed that most of the studied metrics tend to favour partitions into a smaller number of big clusters, even when their relative internal and external connectivity are the same. Modularity and the density ratio were found to be less biased.
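As a rough illustration of the kind of experiment the abstract describes, the following minimal sketch samples a network from a plain stochastic block model with a preset community structure and scores two candidate partitions with Newman's modularity. It is not the authors' implementation: the function names `sample_sbm` and `modularity`, the parameters `p_in` and `p_out`, and the block sizes are all assumptions chosen for illustration. The preferential attachment block model and the density ratio are not shown.

```python
import random
from itertools import combinations


def sample_sbm(block_sizes, p_in, p_out, seed=0):
    """Sample an undirected stochastic block model graph.

    Hypothetical helper: nodes in the same block are linked with
    probability p_in, nodes in different blocks with probability p_out.
    Returns (labels, edges), where labels[i] is the block of node i.
    """
    rng = random.Random(seed)
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    edges = []
    for u, v in combinations(range(len(labels)), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return labels, edges


def modularity(labels, edges):
    """Newman modularity: sum over clusters of m_c/m - (d_c/(2m))^2,
    with m_c the internal edges and d_c the total degree of cluster c."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}
    degree_sum = {}
    for u, v in edges:
        degree_sum[labels[u]] = degree_sum.get(labels[u], 0) + 1
        degree_sum[labels[v]] = degree_sum.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            internal[labels[u]] = internal.get(labels[u], 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Four planted communities of 25 nodes each (illustrative values only).
labels, edges = sample_sbm([25, 25, 25, 25], p_in=0.3, p_out=0.02, seed=1)

# A coarser candidate partition that merges the planted blocks pairwise.
coarse = [b // 2 for b in labels]

q_planted = modularity(labels, edges)
q_coarse = modularity(coarse, edges)
print(f"planted partition: {q_planted:.3f}, coarse partition: {q_coarse:.3f}")
```

In this flat model the coarse partition has no planted meaning of its own. A multi-level structure like the one the abstract mentions would need at least a third, intermediate connection probability between sibling blocks, so that both the fine and the coarse partition are meaningful and a metric's preference between them can be read as a bias.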

Keywords