Entropy (Mar 2023)

Rate Distortion Theory for Descriptive Statistics

  • Peter Harremoës

DOI
https://doi.org/10.3390/e25030456
Journal volume & issue
Vol. 25, no. 3
p. 456

Abstract

Rate distortion theory was developed for optimizing lossy compression of data, but it also has applications in statistics. In this paper, we illustrate how rate distortion theory can be used to analyze various datasets. The analysis involves testing, identification of outliers, choice of compression rate, calculation of optimal reconstruction points, and assigning “descriptive confidence regions” to the reconstruction points. We study four models or datasets of increasing complexity: clustering, Gaussian models, linear regression, and a dataset describing orientations of early Islamic mosques. These examples illustrate how rate distortion analysis may serve as a common framework for handling different statistical problems.

Keywords