IEEE Access (Jan 2024)

Machine Interpretation of Ballet Dance: Alternating Wavelet Spatial and Channel Attention Based Learning Model

  • P. V. V. Kishore,
  • D. Anil Kumar,
  • P. Praveen Kumar,
  • D. Srihari,
  • N. Sasikala,
  • L. Divyasree

DOI
https://doi.org/10.1109/ACCESS.2024.3390004
Journal volume & issue
Vol. 12
pp. 55264 – 55280

Abstract

Ballet is a 15th-century concert performance dance form that originated in Italy. Identifying ballet dance poses in live performance videos with current AI models is challenging because the pixel distribution of human actions varies across backgrounds. Notably, the performance of these models on online video datasets improved with both channel attention (CA) and spatial attention (SA), but they tend to generate over-smoothed convolutional features due to feature averaging in the attention network. Alternatively, wavelet attention preserves both high- and low-frequency components in the features, which improves test accuracy. Applying CA and SA to wavelet features simultaneously resulted in hyper-refined features due to double averaging. To overcome this drawback, Alternating Wavelet Channel and Spatial Attention (AWCSA), applicable to any learning network as the backbone architecture, is proposed. The global features across the residual connections in the backbone (ResNet50) are amplified exclusively with low- and high-frequency local features across the channel and spatial dimensions, alternating one after the other. The proposed AWCSA is evaluated on the Ballet online dance video dataset (BOVD23) along with baseline action datasets. The end-to-end trained AWCSA recorded 6-8% higher performance metrics on the BOVD23 dataset than its counterparts.
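
The alternation described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not name the wavelet, the attention layers, or where the blocks sit in ResNet50, so the single-level Haar transform, the parameter-free sigmoid gates, and the function names (`haar_dwt2`, `wavelet_channel_attention`, `wavelet_spatial_attention`, `awcsa_sketch`) are all assumptions made for illustration. It only shows the idea of applying a wavelet-based channel gate and a wavelet-based spatial gate one after the other, rather than both at once, around a residual connection.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2-D Haar transform of a (C, H, W) array.

    H and W must be even. Returns one low-frequency sub-band and three
    high-frequency detail sub-bands, each of shape (C, H/2, W/2).
    """
    a = x[:, 0::2, 0::2]
    b = x[:, 0::2, 1::2]
    c = x[:, 1::2, 0::2]
    d = x[:, 1::2, 1::2]
    ll = (a + b + c + d) / 2.0   # low-frequency approximation
    d1 = (a + b - c - d) / 2.0   # detail: difference between rows
    d2 = (a - b + c - d) / 2.0   # detail: difference between columns
    d3 = (a - b - c + d) / 2.0   # detail: diagonal difference
    return ll, d1, d2, d3


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def wavelet_channel_attention(x):
    """Gate each channel using pooled low- and high-frequency statistics.

    Assumption: a real model would pass these statistics through learned
    layers; here they are summed and squashed directly.
    """
    ll, d1, d2, d3 = haar_dwt2(x)
    low = ll.mean(axis=(1, 2))                                    # (C,)
    high = (np.abs(d1) + np.abs(d2) + np.abs(d3)).mean(axis=(1, 2))
    weights = sigmoid(low + high)                                 # (C,)
    return x * weights[:, None, None]


def wavelet_spatial_attention(x):
    """Gate each spatial location using channel-pooled wavelet sub-bands."""
    ll, d1, d2, d3 = haar_dwt2(x)
    low = ll.mean(axis=0)                                         # (H/2, W/2)
    high = (np.abs(d1) + np.abs(d2) + np.abs(d3)).mean(axis=0)
    mask = sigmoid(low + high)
    # Nearest-neighbour upsampling back to (H, W).
    mask = np.repeat(np.repeat(mask, 2, axis=0), 2, axis=1)
    return x * mask[None, :, :]


def awcsa_sketch(x, num_stages=4):
    """Alternate channel and spatial wavelet attention across stages.

    Each stage adds the attended features back to its input, standing in
    for a residual connection in the backbone (hypothetical placement).
    """
    out = x
    for stage in range(num_stages):
        if stage % 2 == 0:
            attended = wavelet_channel_attention(out)
        else:
            attended = wavelet_spatial_attention(out)
        out = out + attended
    return out


if __name__ == "__main__":
    features = np.random.default_rng(0).normal(size=(3, 8, 8))
    print(awcsa_sketch(features).shape)
```

Because each stage applies only one of the two gates, the features are averaged along one dimension at a time, which is the property the abstract contrasts with simultaneous CA and SA.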

Keywords