Entropy (May 2013)

A Novel Nonparametric Distance Estimator for Densities with Error Bounds

  • Alexandre R.F. Carvalho,
  • João Manuel R. S. Tavares,
  • Jose C. Principe

DOI
https://doi.org/10.3390/e15051609
Journal volume & issue
Vol. 15, no. 5
pp. 1609 – 1623

Abstract

Read online

The use of a metric to assess distance between probability densities is an important practical problem. In this work, a particular metric induced by an α-divergence is studied. The Hellinger metric can be interpreted as a particular case within the framework of generalized Tsallis divergences and entropies. The nonparametric Parzen’s density estimator emerges as a natural candidate to estimate the underlying probability density function, since it may account for data from different groups, or experiments with distinct instrumental precisions, i.e., non-independent and identically distributed (non-i.i.d.) data. However, the information theoretic derived metric of the nonparametric Parzen’s density estimator displays infinite variance, limiting the direct use of resampling estimators. Based on measure theory, we present a change of measure to build a finite variance density allowing the use of resampling estimators. In order to counteract the poor scaling with dimension, we propose a new nonparametric two-stage robust resampling estimator of Hellinger’s metric error bounds for heterocedastic data. The approach presents very promising results allowing the use of different covariances for different clusters with impact on the distance evaluation.

Keywords