Information (Feb 2025)
Heavy-Tailed Linear Regression and <i>K</i>-Means
Abstract
Most standard machine learning algorithms are formulated with the implicit assumption that empirical data are “well-behaved”. In this work, we consider heavy-tailed data whose underlying distribution does not necessarily possess finite moments. In such a scenario, classical linear regression techniques and the standard K-means algorithm fail. We formulate and validate heavy-tailed versions of these machine learning methods for both scalar and multidimensional settings. The new algorithms are based on recently defined location and power parameters that are appropriate for such data. Additionally, we showcase the enhanced performance of the proposed methods in comparison with other tailored methods found in the literature.
Keywords