Journal of Imaging (Aug 2024)
FineTea: A Novel Fine-Grained Action Recognition Video Dataset for Tea Ceremony Actions
Abstract
Methods based on deep learning have achieved great success in the field of video action recognition. When these methods are applied to real-world scenarios that require fine-grained analysis of actions, such as being tested on a tea ceremony, limitations may arise. To promote the development of fine-grained action recognition, a fine-grained video action dataset is constructed by collecting videos of tea ceremony actions. This dataset includes 2745 video clips. By using a hierarchical fine-grained action classification approach, these clips are divided into 9 basic action classes and 31 fine-grained action subclasses. To better establish a fine-grained temporal model for tea ceremony actions, a method named TSM-ConvNeXt is proposed that integrates a TSM into the high-performance convolutional neural network ConvNeXt. Compared to a baseline method using ResNet50, the experimental performance of TSM-ConvNeXt is improved by 7.31%. Furthermore, compared with the state-of-the-art methods for action recognition on the FineTea and Diving48 datasets, the proposed approach achieves the best experimental results. The FineTea dataset is publicly available.
Keywords