Entropy (Nov 2022)

Communication Efficient Algorithms for Bounding and Approximating the Empirical Entropy in Distributed Systems

  • Amit Shahar,
  • Yuval Alfassi,
  • Daniel Keren

DOI
https://doi.org/10.3390/e24111611
Journal volume & issue
Vol. 24, no. 11
p. 1611

Abstract


The empirical entropy is a key statistical measure of data frequency vectors, enabling one to estimate how diverse the data are. From the computational point of view, it is important to quickly compute, approximate, or bound the entropy. In a distributed system, the representative ("global") frequency vector is the average of the "local" frequency vectors, each residing in a distinct node. The trivial solution, in which the local vectors are aggregated at a single node and the entropy of their average is computed, incurs a huge communication overhead. Hence, the challenge is to approximate, or bound, the entropy of the global vector while reducing the communication overhead. In this paper, we develop algorithms which achieve this goal.
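For concreteness, the following is a minimal sketch (not taken from the paper) of the trivial baseline the abstract refers to: every node ships its full local frequency vector to a coordinator, which averages them and computes the empirical entropy of the resulting global vector. The vector dimension, node count, and generated data are illustrative assumptions.

```python
import numpy as np

def empirical_entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i), skipping zero entries."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# k local frequency vectors (each sums to 1), one per node -- assumed data.
rng = np.random.default_rng(0)
k, d = 5, 1000
local_vectors = [rng.dirichlet(np.ones(d)) for _ in range(k)]

# The global vector is the average of the local ones. Shipping all local
# vectors costs O(k * d) communication, which is the overhead the paper's
# algorithms are designed to avoid.
global_vector = np.mean(local_vectors, axis=0)
print(empirical_entropy(global_vector))
```

The sketch only illustrates what is being approximated; the paper's contribution is bounding or approximating the entropy of `global_vector` without collecting the full local vectors.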

Keywords