Inclusive random sampling in graphs and networks

Yitzchak Novick; Amotz Bar-Noy

doi:10.1007/s41109-023-00579-y

Applied Network Science (Sep 2023)

Inclusive random sampling in graphs and networks

Yitzchak Novick,
Amotz Bar-Noy

Affiliations

Yitzchak Novick: CUNY Graduate Center
Amotz Bar-Noy: Brooklyn College

DOI: https://doi.org/10.1007/s41109-023-00579-y
Journal volume & issue: Vol. 8, no. 1
pp. 1 – 17

Abstract

Read online

Abstract It is often of interest to sample vertices from a graph with a bias towards higher-degree vertices. One well-known method, which we call random neighbor or RN, involves taking a vertex at random and exchanging it for one of its neighbors. Loosely inspired by the friendship paradox, the method is predicated on the fact that the expected degree of the neighbor is greater than or equal to the expected degree of the initial vertex. Another method that is actually perfectly analogous to the friendship paradox is random edge, or RE, where an edge is sampled at random, and then one of the two endpoint vertices is selected at random. Obviously, random sampling is only required when full knowledge of the graph is unattainable. But, while it is true in most cases that knowledge of all vertices’ degrees cannot be obtained, it is often trivial to learn the degree of specific vertices that have already been isolated. In light of this, we suggest a tweak to both RN and RE, inclusive random sampling. In inclusive random neighbor (IRN) the initial vertex and the selected neighbor are considered, in inclusive random edge (IRE) the two endpoint vertices are, and in both cases, we learn the degree of each and select the vertex of higher degree. This paper explores inclusive random sampling through theoretical analysis and experimentation. We establish meaningful bounds on IRN and IRE’s performances, in particular in comparison to each other and to their exclusive counterparts. Our analyses highlight differences of the original, exclusive versions as well. The results provide practical insight for strategizing a random sampling method, and also highlight graph characteristics that impact the question of which methods will perform strongly in which graphs.

Published in Applied Network Science

ISSN: 2364-8228 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Applied mathematics. Quantitative methods
Website: http://appliednetsci.springeropen.com/

About the journal

Abstract

Keywords