IET Computer Vision (Oct 2017)

Human interaction recognition fusing multiple features of depth sequences

  • Jianjun Li,
  • Xia Mao,
  • Lijiang Chen,
  • Lan Wang

DOI
https://doi.org/10.1049/iet-cvi.2017.0025
Journal volume & issue
Vol. 11, no. 7
pp. 560 – 566

Abstract

Human interaction recognition has played a major role in building intelligent video surveillance systems. Recently, depth data captured by emerging RGB‐D sensors has begun to show its importance in human interaction recognition. This study proposes a novel framework for human interaction recognition using depth information, including an algorithm that reconstructs a depth sequence with as few key frames as possible. The proposed framework includes two essential modules: first, key frames are extracted under a sparse constraint; then, a fused multi‐feature is constructed from two types of available features and max‐pooling. Finally, the multiple features are sent directly to an SVM to recognise the human activity. This study explores a static and dynamic feature fusion method to improve recognition performance using the contextual relevance of continuous frames. A weight is used to fuse shape and optical flow features, which not only enhances the description capability of human behavioural characteristics in the spatiotemporal domain, but also effectively reduces the adverse impact of certain distorted points of interest on target recognition. Experimental results show that the proposed approach yields a considerable improvement in accuracy over state‐of‐the‐art approaches on a public action dataset.
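
The weighted fusion and max‐pooling steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names (`fuse_frame`, `max_pool`, `sequence_descriptor`), the L2 normalisation, the convex weighting `w` / `1 - w`, and the use of concatenation are hypothetical choices, not the authors' method. Per-key-frame shape and optical flow descriptors are assumed to be already computed.

```python
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse_frame(shape_feat: Sequence[float],
               flow_feat: Sequence[float],
               w: float) -> List[float]:
    """Weighted concatenation of a static (shape) and a dynamic (optical
    flow) descriptor for one key frame. ASSUMPTION: the paper's fusion
    rule may differ; w weights shape, (1 - w) weights flow."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    s = l2_normalize(shape_feat)
    f = l2_normalize(flow_feat)
    return [w * x for x in s] + [(1.0 - w) * x for x in f]


def max_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise maximum over all key frames of a sequence."""
    return [max(column) for column in zip(*frames)]


def sequence_descriptor(shape_feats: Sequence[Sequence[float]],
                        flow_feats: Sequence[Sequence[float]],
                        w: float = 0.5) -> List[float]:
    """Fuse each key frame, then max-pool into one fixed-length vector."""
    fused = [fuse_frame(s, f, w) for s, f in zip(shape_feats, flow_feats)]
    return max_pool(fused)


# Two key frames with toy 2-D shape and 2-D flow descriptors.
descriptor = sequence_descriptor(
    shape_feats=[[3.0, 4.0], [0.0, 1.0]],
    flow_feats=[[1.0, 0.0], [0.0, 2.0]],
    w=0.5,
)
```

Under these assumptions, the resulting fixed-length `descriptor` is the kind of vector that would be passed to an SVM classifier, whatever the number of key frames in the sequence.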

Keywords