IEEE Access (Jan 2020)

Learning and Applying a Function Over Distributions

  • Glenn Healey
  • Shiyuan Zhao

DOI
https://doi.org/10.1109/ACCESS.2020.3024699
Journal volume & issue
Vol. 8
pp. 172196 – 172203

Abstract

We present a method for learning a function over distributions. The method generalizes nonparametric kernel regression by using the earth mover's distance as a metric for distribution space. The technique is applied to the problem of learning the dependence of pitcher performance in baseball on multidimensional pitch distributions that are controlled by the pitcher. The distributions are derived from sensor measurements that capture the physical properties of each pitch. Finding this dependence allows the recovery of optimal pitch frequencies for individual pitchers. This application is amenable to the use of signatures to represent the distributions, and a whitening step is employed to account for the correlations and variances of the pitch variables. Cross-validation is used to optimize the kernel smoothing parameter. A set of experiments demonstrates that the new method accurately predicts changes in pitcher performance in response to changes in pitch distribution and also outperforms an existing technique for this application.
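The core idea of the abstract, kernel regression in which the distance between inputs is the earth mover's distance (EMD) between distributions, can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several simplifying assumptions that the abstract does not state: distributions are one-dimensional samples of equal size with uniform weights (so the EMD reduces to the mean absolute difference of the sorted values), the kernel is Gaussian, the estimator is of Nadaraya-Watson form, and the smoothing parameter is chosen by leave-one-out cross-validation. The multidimensional signatures and the whitening step described in the abstract are omitted, and all function names are hypothetical.

```python
import math


def emd_1d(xs, ys):
    """EMD between two equal-size 1-D samples with uniform weights.

    Assumption: in 1-D with equal sample sizes, the optimal transport
    plan matches sorted values, so the EMD is the mean absolute gap.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-size samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def predict(query, train_dists, train_targets, h):
    """Nadaraya-Watson estimate using a Gaussian kernel on the EMD.

    h is the kernel smoothing parameter (bandwidth), assumed > 0.
    """
    weights = [math.exp(-0.5 * (emd_1d(query, d) / h) ** 2) for d in train_dists]
    total = sum(weights)
    if total == 0.0:
        # All kernel weights underflowed; fall back to the global mean.
        return sum(train_targets) / len(train_targets)
    return sum(w * y for w, y in zip(weights, train_targets)) / total


def loo_cv_error(train_dists, train_targets, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    n = len(train_dists)
    err = 0.0
    for i in range(n):
        rest_d = train_dists[:i] + train_dists[i + 1:]
        rest_y = train_targets[:i] + train_targets[i + 1:]
        err += (predict(train_dists[i], rest_d, rest_y, h) - train_targets[i]) ** 2
    return err / n


def select_bandwidth(train_dists, train_targets, candidates):
    """Pick the candidate bandwidth with the lowest leave-one-out error."""
    return min(candidates, key=lambda h: loo_cv_error(train_dists, train_targets, h))


# Toy data (made up for illustration): each input is a sample from a distribution.
dists = [[0.0, 1.0], [10.0, 11.0], [4.0, 5.0]]
targets = [1.0, 3.0, 2.0]
h = select_bandwidth(dists, targets, [0.5, 1.0, 2.0, 5.0])
estimate = predict([5.0, 6.0], dists, targets, h)
```

In the setting the abstract describes, each input would instead be a multidimensional pitch distribution represented as a signature, and the EMD would be computed by solving a transportation problem between whitened signatures rather than by sorting.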

Keywords