SkatingVerse: A large‐scale benchmark for comprehensive evaluation on human action understanding

Ziliang Gan; Lei Jin; Yi Cheng; Yu Cheng; Yinglei Teng; Zun Li; Yawen Li; Wenhan Yang; Zheng Zhu; Junliang Xing; Jian Zhao

doi:10.1049/cvi2.12287

IET Computer Vision (Oct 2024)

Affiliations

Ziliang Gan: Beijing University of Posts and Telecommunications, Beijing, China
Lei Jin: Beijing University of Posts and Telecommunications, Beijing, China
Yi Cheng: I2R, A*STAR, Singapore, Singapore
Yu Cheng: National University of Singapore, Singapore, Singapore
Yinglei Teng: Beijing University of Posts and Telecommunications, Beijing, China
Zun Li: Beijing University of Technology, Beijing, China
Yawen Li: Beijing University of Posts and Telecommunications, Beijing, China
Wenhan Yang: Peng Cheng Laboratory, Shenzhen, China
Zheng Zhu: Tsinghua University, Beijing, China
Junliang Xing: Tsinghua University, Beijing, China
Jian Zhao: Peng Cheng Laboratory, Shenzhen, China

DOI: https://doi.org/10.1049/cvi2.12287
Journal volume & issue: Vol. 18, no. 7
pp. 888 – 906

Abstract

Human action understanding (HAU) is a broad topic that involves specific tasks, such as action localisation, recognition, and assessment. However, most popular HAU datasets are bound to one task based on particular actions. Combining different but relevant HAU tasks to establish a unified action understanding system is challenging due to the disparate actions across datasets. A large‐scale and comprehensive benchmark, namely SkatingVerse, is constructed for action recognition, segmentation, proposal, and assessment. SkatingVerse focuses on fine‐grained sport actions; hence, figure skating is chosen as the task object, which eliminates the biases of object, scene, and space that exist in most previous datasets. In addition, skating actions have inherent complexity and similarity, which poses an enormous challenge for current algorithms. A total of 1687 official figure skating competition videos were collected, with a total duration of 184.4 h, more than four times that of other datasets on a similar topic. SkatingVerse makes it possible to formulate a unified task that outputs fine‐grained human action classification and assessment results from a raw figure skating competition video. In addition, SkatingVerse can facilitate the study of HAU foundation models owing to its large scale and abundant categories. Moreover, an image modality is incorporated into SkatingVerse for the human pose estimation task. Extensive experimental results show that (1) SkatingVerse significantly helps the training and evaluation of HAU methods, (2) the performance of existing HAU methods has much room to improve, and SkatingVerse helps to reduce such gaps, and (3) unifying relevant tasks in HAU through a uniform dataset can facilitate more practical applications. SkatingVerse will be publicly available to facilitate further studies on relevant problems.

Published in IET Computer Vision

ISSN: 1751-9632 (Print); 1751-9640 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519640