IEEE Access (Jan 2017)

Learning, Visualizing, and Assessing a Model for the Intrinsic Value of a Batted Ball

  • Glenn Healey

DOI
https://doi.org/10.1109/ACCESS.2017.2728663
Journal volume & issue
Vol. 5
pp. 13811 – 13822

Abstract

We present a multidisciplinary approach for learning, visualizing, and assessing a model for the intrinsic value of a batted ball in baseball. The new methodology addresses one of the most fundamental problems in baseball analytics. Traditional outcome-based statistics for representing player skill on batted balls have been shown to have a low degree of repeatability due to the effects of multiple confounding variables, such as the defense, weather, and ballpark. New sensors have created the opportunity to define batted-ball descriptors that are invariant to these variables. We exploit this opportunity by using a Bayesian model to construct a continuous mapping from a vector of batted-ball parameters to an intrinsic value defined using a linear weights representation for run value. A kernel method is used to learn nonparametric estimates for the component probability density functions in Bayes' theorem using a set of over 100 000 batted-ball measurements, while cross-validation enables the model to adapt to the size and structure of the data. Properties of the mapping are visualized by considering reduced-dimension subsets of the batted-ball parameter space. The approach separates the intrinsic value of a batted ball at contact from its outcome and, as a result, allows the definition of batted-ball statistics for batters and pitchers that are less subject to systematic bias and random variation than traditional statistics. We use Cronbach's alpha to show that statistics derived from batted-ball intrinsic values have a higher reliability than the traditional outcome-based statistics and that this leads to more accurate estimates of player talent level that can be used for performance forecasting.
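
The mapping described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: it assumes a two-component parameter vector (exit speed and vertical launch angle), a product-Gaussian kernel with a fixed bandwidth (the paper selects smoothing by cross-validation), three outcomes only, and made-up linear weights. The names `LINEAR_WEIGHTS`, `kernel_density`, and `intrinsic_value`, as well as the synthetic data, are hypothetical. For each outcome, the class-conditional density is estimated with a kernel method, combined with the outcome prior through Bayes' theorem, and the resulting posterior is averaged against the run values.

```python
import numpy as np

# Illustrative run values per outcome (hypothetical, not taken from the paper).
LINEAR_WEIGHTS = {"out": 0.0, "single": 0.9, "home_run": 2.0}


def kernel_density(points, query, bandwidth):
    """Product-Gaussian kernel estimate of the density at `query`."""
    points = np.atleast_2d(np.asarray(points, dtype=float))
    n, d = points.shape
    h = np.broadcast_to(np.asarray(bandwidth, dtype=float), (d,))
    z = (np.asarray(query, dtype=float) - points) / h   # (n, d) scaled offsets
    norm = (2.0 * np.pi) ** (d / 2.0) * np.prod(h)      # Gaussian normalizer
    return np.exp(-0.5 * np.sum(z ** 2, axis=1)).sum() / (n * norm)


def intrinsic_value(x, samples, labels, bandwidth):
    """Return (expected run value, posterior over outcomes) for vector x."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    joint = {}
    for outcome in LINEAR_WEIGHTS:
        mask = labels == outcome
        if not mask.any():
            joint[outcome] = 0.0
            continue
        prior = mask.mean()                                       # P(outcome)
        likelihood = kernel_density(samples[mask], x, bandwidth)  # p(x | outcome)
        joint[outcome] = prior * likelihood
    evidence = sum(joint.values())                                # p(x)
    if evidence == 0.0:
        raise ValueError("no kernel support at x; increase the bandwidth")
    posterior = {k: v / evidence for k, v in joint.items()}       # Bayes' theorem
    value = sum(LINEAR_WEIGHTS[k] * p for k, p in posterior.items())
    return value, posterior


# Synthetic demonstration data: columns are (exit speed in mph, launch angle in degrees).
rng = np.random.default_rng(0)
samples = np.vstack([
    rng.normal([75.0, -5.0], [8.0, 10.0], size=(300, 2)),   # mostly outs
    rng.normal([92.0, 10.0], [6.0, 6.0], size=(150, 2)),    # mostly singles
    rng.normal([105.0, 28.0], [4.0, 5.0], size=(50, 2)),    # mostly home runs
])
labels = np.array(["out"] * 300 + ["single"] * 150 + ["home_run"] * 50)
bandwidth = (3.0, 4.0)  # fixed here; an assumption standing in for cross-validation

value_hard, posterior_hard = intrinsic_value((105.0, 28.0), samples, labels, bandwidth)
value_weak, posterior_weak = intrinsic_value((70.0, -10.0), samples, labels, bandwidth)
print(f"hard-hit fly ball: {value_hard:.3f}, weak grounder: {value_weak:.3f}")
```

Because the value depends only on the measured contact parameters and not on what happened after contact, two balls struck identically receive the same value regardless of the defense, weather, or ballpark, which is the separation of intrinsic value from outcome that the abstract describes.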

Keywords