HARNet in deep learning approach—a systematic survey

Neelam Sanjeev Kumar; G. Deepika; V. Goutham; B. Buvaneswari; R. Vijaya Kumar Reddy; Sanjeevkumar Angadi; C. Dhanamjayulu; Ravikumar Chinthaginjala; Faruq Mohammad; Baseem Khan

doi:10.1038/s41598-024-58074-y

Scientific Reports (Apr 2024)

Affiliations

Neelam Sanjeev Kumar: Department of Computer Science and Engineering, SRM Institute of Science and Technology
G. Deepika: Department of Electronics and Communication Engineering, St. Peter’s Engineering College
V. Goutham: Department of Computer Science and Engineering, St Mary’s Group of Institutions
B. Buvaneswari: Department of Information Technology, Panimalar Engineering College
R. Vijaya Kumar Reddy: Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation
Sanjeevkumar Angadi: Department of Computer Science and Engineering, Nutan College of Engineering and Research
C. Dhanamjayulu: School of Electrical Engineering, Vellore Institute of Technology
Ravikumar Chinthaginjala: School of Electronics Engineering, Vellore Institute of Technology
Faruq Mohammad: Department of Chemistry, College of Science, King Saud University
Baseem Khan: Department of Electrical and Computer Engineering, Hawassa University

Journal volume & issue: Vol. 14, no. 1, pp. 1–15

Abstract

This article examines human action recognition (HAR) methodologies at the convergence of deep learning and computer vision. We trace the progression from handcrafted feature-based approaches to end-to-end learning, with a particular focus on the role of large-scale datasets. Our proposed taxonomy classifies research paradigms, such as temporal modelling and spatial features, and highlights the merits and drawbacks of each. We also present HARNet, a Multi-Model Deep Learning architecture that integrates recurrent and convolutional neural networks and uses attention mechanisms to improve accuracy and robustness. The VideoMAE v2 method (https://github.com/OpenGVLab/VideoMAEv2) serves as a case study to illustrate practical implementations and obstacles. The survey is intended as a resource for researchers and practitioners who want a comprehensive understanding of recent advances in HAR as they relate to computer vision and deep learning.

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
