Applied Sciences (Aug 2023)

Deriving Ground Truth Labels for Regression Problems Using Annotator Precision

  • Benjamin Johnston,
  • Philip de Chazal

DOI
https://doi.org/10.3390/app13169130
Journal volume & issue
Vol. 13, no. 16
p. 9130

Abstract

When training machine learning models for practical applications, a high-quality ground truth dataset is critical. Unlike in classification problems, there is currently no effective method for determining a single ground truth value or landmark from a set of annotations in regression problems. We propose a novel method for deriving ground truth labels in regression problems that considers the performance and precision of individual annotators when identifying each label separately. In contrast to the commonly accepted method of computing the global mean, our method does not assume that all annotators are equally capable of completing the specified task, but rather ensures that higher-performing annotators contribute more to the final result. The ground truth selection method described in this paper provides a means of improving the quality of input data for machine learning model development by removing lower-quality labels. In this study, we objectively demonstrate the improved performance by applying the method to a simulated dataset, where a canonical ground truth position is known, as well as to a sample of crowd-sourced labels.
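The abstract contrasts the global mean with a combination that weights each annotator by an estimate of their precision. The paper's exact algorithm is not given here, so the following is only a minimal illustrative sketch of one common precision-weighted scheme (inverse mean-squared deviation from an initial consensus); the function name and weighting rule are assumptions, not the authors' method.

```python
import numpy as np

def precision_weighted_labels(annotations):
    """Combine annotations into one label per item, weighting annotators
    by an estimate of their precision.

    annotations: (n_annotators, n_items) array of regression labels.
    Returns an (n_items,) array of combined labels.

    Illustrative sketch only: weights are the inverse mean squared
    deviation of each annotator from the plain-mean consensus.
    """
    annotations = np.asarray(annotations, dtype=float)
    # Initial consensus: the global mean across annotators.
    consensus = annotations.mean(axis=0)
    # Per-annotator error estimate: mean squared deviation from consensus.
    mse = ((annotations - consensus) ** 2).mean(axis=1)
    # Higher-precision (lower-error) annotators receive larger weights.
    weights = 1.0 / (mse + 1e-12)
    weights /= weights.sum()
    # Precision-weighted combination replaces the unweighted global mean.
    return weights @ annotations
```

For example, with two annotators who agree and one outlier, the combined labels sit much closer to the agreeing pair than the global mean does. In practice the consensus and weights could be re-estimated iteratively until the labels converge.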

Keywords