Video Classification of Cloth Simulations: Deep Learning and Position-Based Dynamics for Stiffness Prediction

Makara Mao; Hongly Va; Min Hong

doi:10.3390/s24020549

Sensors (Jan 2024)

Affiliations

Makara Mao: Department of Software Convergence, Soonchunhyang University, Asan 31538, Republic of Korea
Hongly Va: Department of Software Convergence, Soonchunhyang University, Asan 31538, Republic of Korea
Min Hong: Department of Computer Software Engineering, Soonchunhyang University, Asan 31538, Republic of Korea

Journal volume & issue: Vol. 24, no. 2, p. 549

Abstract

In virtual reality, augmented reality, and animation, the goal is to reproduce the movement of real-world deformable objects as faithfully as possible in the virtual world. This paper therefore proposes a method that automatically extracts cloth stiffness values from video scenes and applies them as material properties in virtual cloth simulation. We use deep learning (DL) models to tackle this problem. The Transformer model, combined with pre-trained architectures such as DenseNet121, ResNet50, VGG16, and VGG19, is a leading choice for video classification tasks. Position-Based Dynamics (PBD) is a computational framework widely used in computer graphics and physics-based simulation of deformable objects, notably cloth. It provides an inherently stable and efficient way to replicate complex dynamic behaviors such as folding, stretching, and collision interactions. Our proposed model characterizes virtual cloth with softness-to-stiffness labels and categorizes videos according to this labeling. The cloth movement dataset used in this research is derived from a stiffness-oriented cloth simulation designed for this purpose. Our experimental assessment covers a dataset of 3840 videos, which forms a multi-label video classification dataset. The proposed model achieves an average accuracy of 99.50%, significantly outperforming alternative models such as RNN, GRU, LSTM, and Transformer.
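To illustrate how a stiffness value can shape cloth motion in PBD, the following is a minimal sketch of the standard PBD distance-constraint projection between two particles. It is not the authors' simulator: the function name, the parameter names, and the convention that stiffness is a scalar in [0, 1] scaling the positional correction are assumptions made for illustration.

```python
import numpy as np

def project_distance_constraint(p1, p2, rest_length, stiffness, w1=1.0, w2=1.0):
    """Apply one PBD projection step to a distance constraint.

    p1, p2      : particle positions (NumPy arrays)
    rest_length : target distance between the two particles
    stiffness   : assumed scalar in [0, 1]; 1 fully restores the rest length
    w1, w2      : inverse masses of the particles (0 means pinned)
    """
    delta = p1 - p2
    dist = np.linalg.norm(delta)
    if dist == 0.0 or (w1 + w2) == 0.0:
        # Degenerate case: no direction to correct along, or both particles pinned.
        return p1, p2
    direction = delta / dist
    violation = dist - rest_length
    # Correction is shared according to inverse mass and scaled by stiffness.
    correction = stiffness * violation / (w1 + w2) * direction
    return p1 - w1 * correction, p2 + w2 * correction

# Two particles stretched to twice their rest length.
p1 = np.array([0.0, 0.0, 0.0])
p2 = np.array([2.0, 0.0, 0.0])

soft = project_distance_constraint(p1, p2, rest_length=1.0, stiffness=0.1)
stiff = project_distance_constraint(p1, p2, rest_length=1.0, stiffness=1.0)
```

With these inputs, the stiff setting restores the rest length in a single projection, while the soft setting removes only a small part of the stretch. Differences of this kind in how quickly stretch is corrected are what produce visibly different cloth motion across stiffness labels.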

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
