IET Computer Vision (Oct 2024)
SkatingVerse: A large‐scale benchmark for comprehensive evaluation on human action understanding
Abstract
Abstract Human action understanding (HAU) is a broad topic that involves specific tasks, such as action localisation, recognition, and assessment. However, most popular HAU datasets are bound to one task based on particular actions. Combining different but relevant HAU tasks to establish a unified action understanding system is challenging due to the disparate actions across datasets. A large‐scale and comprehensive benchmark, namely SkatingVerse is constructed for action recognition, segmentation, proposal, and assessment. SkatingVerse focus on fine‐grained sport action, hence figure skating is chosen as the task object, which eliminates the biases of the object, scene, and space that exist in most previous datasets. In addition, skating actions have inherent complexity and similarity, which is an enormous challenge for current algorithms. A total of 1687 official figure skating competition videos was collected with a total of 184.4 h, exceeding four times over other datasets with a similar topic. SkatingVerse enables to formulate a unified task to output fine‐grained human action classification and assessment results from a raw figure skating competition video. In addition, SkatingVerse can facilitate the study of HAU foundation model due to its large scale and abundant categories. Moreover, image modality is incorporated for human pose estimation task into SkatingVerse. Extensive experimental results show that (1) SkatingVerse significantly helps the training and evaluation of HAU methods, (2) the performance of existing HAU methods has much room to improve, and SkatingVerse helps to reduce such gaps, and (3) unifying relevant tasks in HAU through a uniform dataset can facilitate more practical applications. SkatingVerse will be publicly available to facilitate further studies on relevant problems.
Keywords