Machine Learning and Knowledge Extraction (Sep 2022)
Comparison of Imputation Methods for Missing Rate of Perceived Exertion Data in Rugby
Abstract
Rate of perceived exertion (RPE) is used to calculate athlete load. Incomplete load data, due to missing athlete-reported RPE, can increase injury risk. The current standard for missing RPE imputation is daily team mean substitution. However, RPE reflects an individual’s effort, so group mean substitution may be suboptimal. This investigation sought to identify the best-performing method for imputing RPE. A total of 987 datasets were collected from women’s rugby sevens competitions. Daily team mean substitution, k-nearest neighbours, random forest, support vector machine, neural network, linear, stepwise, lasso, ridge, and elastic net regression models were assessed at different missingness levels. For each model, the statistical equivalence of true and imputed scores was evaluated. An ANOVA of accuracy by model and missingness level was completed. While the imputed scores from all models were statistically equivalent to the true RPE, accuracy differed between models. Daily team mean substitution was the poorest performing model and random forest the best. Accuracy was low in all models, affirming that RPE is multifaceted and that imputing it requires quantification of potentially overlapping factors. While group mean substitution is discouraged, practitioners are advised to scrutinize any imputation method relating to athlete load.
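For readers unfamiliar with the baseline, the following is a minimal sketch of daily team mean substitution, not the authors' implementation. The record layout (`date`, `athlete`, `rpe`) and the function name `impute_daily_team_mean` are assumptions made for illustration, and the sample values are invented.

```python
from collections import defaultdict
from statistics import mean


def impute_daily_team_mean(records):
    """Fill each missing RPE (None) with the mean RPE reported by the team that day.

    `records` is a list of dicts with hypothetical keys 'date', 'athlete', 'rpe'.
    If nobody reported on a given day, the value stays missing.
    """
    # Collect the RPE values that were actually reported, grouped by day.
    reported = defaultdict(list)
    for r in records:
        if r["rpe"] is not None:
            reported[r["date"]].append(r["rpe"])

    daily_mean = {day: mean(values) for day, values in reported.items()}

    # Return new records; the input list is left unmodified.
    return [
        {**r, "rpe": daily_mean.get(r["date"]) if r["rpe"] is None else r["rpe"]}
        for r in records
    ]


# Invented example data: athlete C on day 1 and athlete B on day 2 did not report.
sessions = [
    {"date": "day1", "athlete": "A", "rpe": 6.0},
    {"date": "day1", "athlete": "B", "rpe": 8.0},
    {"date": "day1", "athlete": "C", "rpe": None},
    {"date": "day2", "athlete": "A", "rpe": 4.0},
    {"date": "day2", "athlete": "B", "rpe": None},
]
imputed = impute_daily_team_mean(sessions)
```

Every athlete with a missing value on a given day receives the same score, which illustrates why this approach cannot reflect individual effort.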
Keywords