IEEE Access (Jan 2022)

Reducing the Label Space a Predefined Ratio for a More Efficient Multilabel Classification

  • Jose M. Moyano,
  • Jose M. Luna,
  • Sebastian Ventura

DOI
https://doi.org/10.1109/ACCESS.2022.3192642
Journal volume & issue
Vol. 10
pp. 76480 – 76492

Abstract

Multi-label classification has been widely used to solve problems in which each instance may be associated with several classes simultaneously rather than just one. Many of these problems involve a large number of labels in the output space, so learning a predictive model from such datasets can be challenging, since the computational complexity of most algorithms depends on the number of labels. In this paper, we propose a methodology to reduce the label space by a user-predefined ratio of labels, aiming to improve the runtime of multi-label classification algorithms. Such a reduction should not produce a significant drop in their final predictive performance. The experimental analysis, carried out over 25 well-known multi-label datasets, demonstrates a drastic reduction in runtime. In addition, statistical analysis shows that reducing the number of labels by 20% does not decrease the predictive performance of the multi-label algorithms according to four well-known evaluation measures. Moreover, in many cases, even when the output space is reduced by up to 50%, the predictive performance of the algorithms is not significantly different from that obtained with the whole set of labels.
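
To make the idea of reducing the label space by a predefined ratio concrete, the following is a minimal sketch and not the authors' method. The abstract does not state how the labels to remove are chosen, so the sketch assumes, purely for illustration, that the least frequent labels are dropped. The function name `reduce_label_space` and its parameters are hypothetical.

```python
from typing import List, Tuple


def reduce_label_space(
    Y: List[List[int]], ratio: float
) -> Tuple[List[List[int]], List[int]]:
    """Drop a fraction `ratio` of the label columns of a binary label matrix.

    Y has one row per instance and one column per label.
    Returns the reduced matrix and the indices of the labels that were kept.
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")

    n_labels = len(Y[0])
    # Number of labels to remove, rounded down.
    n_drop = int(n_labels * ratio)

    # ASSUMPTION (not from the paper): rank labels by how often they occur
    # and remove the least frequent ones; ties are broken by column index.
    freq = [sum(row[j] for row in Y) for j in range(n_labels)]
    order = sorted(range(n_labels), key=lambda j: (freq[j], j))
    dropped = set(order[:n_drop])

    kept = [j for j in range(n_labels) if j not in dropped]
    Y_reduced = [[row[j] for j in kept] for row in Y]
    return Y_reduced, kept


# Toy label matrix: 4 instances, 5 labels.
Y = [
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
]

Y_20, kept_20 = reduce_label_space(Y, 0.2)  # removes 1 of 5 labels
Y_50, kept_50 = reduce_label_space(Y, 0.5)  # removes 2 of 5 labels
```

A multi-label classifier would then be trained on the reduced matrix instead of `Y`; since the cost of most algorithms grows with the number of labels, fewer columns is where the runtime saving would come from.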

Keywords