Machine Interpretation of Ballet Dance: Alternating Wavelet Spatial and Channel Attention Based Learning Model

P. V. V. Kishore; D. Anil Kumar; P. Praveen Kumar; D. Srihari; N. Sasikala; L. Divyasree

doi:10.1109/ACCESS.2024.3390004

IEEE Access (Jan 2024)

Affiliations

P. V. V. Kishore: Department of Electronics and Communication Engineering, Biomechanics and Vision Computing Research Center, Koneru Lakshmaiah Education Foundation (Deemed to be University), Guntur, India
D. Anil Kumar: Department of Electronics and Communication Engineering, PACE Institute of Technology and Sciences, Ongole, India
P. Praveen Kumar: Department of AI and DS, Koneru Lakshmaiah Education Foundation (Deemed to be University), Guntur, India
D. Srihari: Department of Electronics and Communication Engineering, Sri Venkateswara College of Engineering and Technology, Chittoor, India
N. Sasikala: Department of Electronics and Communication Engineering, Kamala Institute of Technology and Science, Warangal, India
L. Divyasree: Department of Electronics and Communication Engineering, Koneru Lakshmaiah Education Foundation (Deemed to be University), Guntur, India

DOI: https://doi.org/10.1109/ACCESS.2024.3390004
Journal volume & issue: Vol. 12
pp. 55264 – 55280

Abstract

Ballet is a 15th-century concert dance form that originated in Italy. Identifying ballet dance poses in live performance videos is challenging for current AI models because the pixel distribution of human actions varies across backgrounds. Notably, their performance on online video datasets improved with both channel attention (CA) and spatial attention (SA) models, but these tend to generate over-smoothed convolutional features due to feature averaging in the attention network. Alternatively, wavelet attention preserves both high- and low-frequency components in the features, which improves the test accuracy. Applying CA and SA to wavelet features simultaneously, however, resulted in hyper-refined features due to double averaging. To overcome this drawback, Alternating Wavelet Channel and Spatial Attention (AWCSA) is proposed, which can be applied to any learning network as the backbone architecture. The global features across the residual connections in the backbone (ResNet50) are amplified exclusively with low- and high-frequency local features across the channel and spatial dimensions alternately, one after the other. The proposed AWCSA is evaluated on the Ballet Online Dance Video dataset (BOVD23) along with baseline action datasets. The end-to-end trained AWCSA recorded 6-8% higher performance metrics on the BOVD23 dataset than its counterparts.
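The abstract gives no layer-level details, so the following is only a minimal NumPy sketch of the alternating idea, not the authors' implementation. It makes several assumptions: a single-level Haar transform stands in for the wavelet, the attention weights are parameter-free sigmoid gates rather than learned layers, and the function names (`haar_dwt2`, `wavelet_channel_attention`, `wavelet_spatial_attention`, `awcsa_block`) are hypothetical. Each stage applies only one of the two attentions and adds the result to the input through a residual connection, so channel and spatial averaging never act on the same feature map at once.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform of a (C, H, W) array.
    H and W must be even. Returns one low-frequency and three detail bands."""
    a, b = x[:, 0::2, 0::2], x[:, 0::2, 1::2]
    c, d = x[:, 1::2, 0::2], x[:, 1::2, 1::2]
    low = (a + b + c + d) / 2.0      # local average (low frequency)
    d_w = (a - b + c - d) / 2.0      # difference along the width
    d_h = (a + b - c - d) / 2.0      # difference along the height
    d_d = (a - b - c + d) / 2.0      # diagonal difference
    return low, d_w, d_h, d_d

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def _low_high(x):
    """Low-frequency band and summed magnitude of the high-frequency bands."""
    low, d_w, d_h, d_d = haar_dwt2(x)
    return low, np.abs(d_w) + np.abs(d_h) + np.abs(d_d)

def wavelet_channel_attention(x):
    """Scale each channel by a gate built from its low and high sub-bands.
    Assumption: a parameter-free gate replaces the learned attention layers."""
    low, high = _low_high(x)
    gate = sigmoid(low.mean(axis=(1, 2)) + high.mean(axis=(1, 2)))  # (C,)
    return x * gate[:, None, None]

def wavelet_spatial_attention(x):
    """Scale each spatial location by a gate built from the sub-bands,
    averaged over channels and upsampled back to (H, W)."""
    low, high = _low_high(x)
    gate = sigmoid(low.mean(axis=0) + high.mean(axis=0))            # (H/2, W/2)
    gate = np.repeat(np.repeat(gate, 2, axis=0), 2, axis=1)         # (H, W)
    return x * gate[None, :, :]

def awcsa_block(x, stage):
    """Alternate the attention type by stage index and add a residual path.
    Even stages use channel attention, odd stages use spatial attention."""
    if stage % 2 == 0:
        attend = wavelet_channel_attention
    else:
        attend = wavelet_spatial_attention
    return x + attend(x)

# Usage: pass a feature map through four alternating stages.
features = np.random.default_rng(0).standard_normal((4, 8, 8))
for stage in range(4):
    features = awcsa_block(features, stage)
```

In a real backbone such as ResNet50, the gates would be produced by trainable layers and the blocks would sit on the residual connections of the network; the sketch only shows the order in which the two attentions are applied.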

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639