Applied Network Science (Nov 2023)

Using a Bayesian approach to reconstruct graph statistics after edge sampling

  • Naomi A. Arnold,
  • Raúl J. Mondragón,
  • Richard G. Clegg

DOI
https://doi.org/10.1007/s41109-023-00574-3
Journal volume & issue
Vol. 8, no. 1
pp. 1 – 18

Abstract

Often, because a network is prohibitively large or because data-collection APIs impose limits, it is not possible to work with a complete network dataset, and sampling is required. One type of sampling that is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for recovering two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate the totals for the network and the counts associated with each edge). We use a Bayesian approach and show a range of methods for constructing a prior that does not require assumptions about the original network. Our approach is tested on two synthetic and three real datasets with diverse sizes, degree distributions, degree-degree correlations and triangle count distributions.
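
To make the setting concrete, the following is a minimal Python sketch of uniform edge sampling and of a Bayesian update for a single node's degree. It is an illustration under stated assumptions, not the authors' implementation: the function names `sample_edges` and `degree_posterior` are hypothetical, and the flat prior in the usage note is a placeholder, not one of the prior constructions proposed in the paper. The sketch relies only on the fact that, when each edge is kept independently with probability p, a node of true degree k has an observed degree distributed as Binomial(k, p).

```python
import math
import random


def sample_edges(edges, p, rng):
    """Uniform edge sampling: keep each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]


def degree_posterior(observed, p, prior):
    """Posterior over a node's true degree k given its observed degree.

    Likelihood: P(observed | k) = Binomial(k, p) evaluated at `observed`.
    `prior` is a dict {k: P(k)}; its choice is an assumption of this sketch.
    """
    weights = {}
    for k, prior_k in prior.items():
        if k < observed:
            continue  # a node cannot lose more edges than it had
        likelihood = (
            math.comb(k, observed) * p**observed * (1 - p) ** (k - observed)
        )
        weights[k] = likelihood * prior_k
    total = sum(weights.values())
    if total == 0:
        raise ValueError("prior gives no support to degrees >= observed")
    return {k: w / total for k, w in weights.items()}
```

For example, `degree_posterior(3, 0.5, {k: 1 / 21 for k in range(21)})` returns a distribution over true degrees 3 to 20 under a flat prior on 0 to 20; the flat prior is used here only to keep the example short.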

Keywords