Advances in Civil Engineering (Jan 2020)
A Deep Learning-Based Approach to Enable Action Recognition for Construction Equipment
Abstract
The digital twin has become a well-recognized concept for virtually representing physical facilities in support of smart construction. It is equally important to recognize human actions and the movement of construction equipment in virtual construction scenes. Human action recognition (HAR), which can be applied to identify construction workers, has been researched extensively. In contrast, research on construction equipment action recognition (CEAR) is very limited, mainly because few datasets contain videos showing the actions of construction equipment. The contributions of this research are as follows: (1) a comprehensive video dataset of 2,064 clips covering five action types for excavators and dump trucks; (2) a new deep learning-based CEAR approach, called a simplified temporal convolutional network (STCN), that combines a convolutional neural network (CNN) with long short-term memory (LSTM, a type of recurrent neural network), where the CNN extracts image features from individual frames and the LSTM extracts temporal features from video frame sequences; and (3) a comparison of the proposed approach with a similar CEAR method and with two of the best-performing HAR approaches, namely three-dimensional (3D) convolutional networks (ConvNets) and two-stream ConvNets, to evaluate the performance of STCN and to investigate whether HAR approaches can be transferred directly to CEAR.
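The CNN-plus-LSTM pipeline described in contribution (2) can be illustrated with a minimal, untrained sketch in plain Python. This is not the authors' STCN implementation: the layer sizes, the single convolution layer with global average pooling, the random weights, and all names (`cnn_features`, `lstm_step`, `classify_clip`) are assumptions chosen only to show how per-frame image features feed a recurrent layer that summarizes the clip. Only the number of output classes (five action types) comes from the abstract.

```python
import math
import random

NUM_ACTIONS = 5  # five action types, per the abstract; everything else is assumed


def conv_feature(frame, kernel):
    """One conv filter (valid cross-correlation) + ReLU + global average pooling."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(frame), len(frame[0])
    total, count = 0.0, 0
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            total += max(s, 0.0)  # ReLU
            count += 1
    return total / count


def cnn_features(frame, kernels):
    """Stand-in for the CNN: one pooled feature per filter."""
    return [conv_feature(frame, k) for k in kernels]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h, c, params):
    """One standard LSTM step; each gate's W acts on the concatenation [x, h]."""
    z = x + h  # list concatenation
    pre = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre[name] = [p + q for p, q in zip(matvec(W, z), b)]
    i = [sigmoid(v) for v in pre["i"]]      # input gate
    f = [sigmoid(v) for v in pre["f"]]      # forget gate
    o = [sigmoid(v) for v in pre["o"]]      # output gate
    g = [math.tanh(v) for v in pre["g"]]    # candidate cell state
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def classify_clip(frames, kernels, lstm_params, W_out, b_out, hidden):
    """Run the CNN on each frame, feed features to the LSTM, classify the last state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for frame in frames:
        h, c = lstm_step(cnn_features(frame, kernels), h, c, lstm_params)
    logits = [p + q for p, q in zip(matvec(W_out, h), b_out)]
    return softmax(logits)


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = random.Random(0)
n_kernels, hidden = 4, 8
kernels = [rand_matrix(rng, 3, 3) for _ in range(n_kernels)]
lstm_params = {g: (rand_matrix(rng, hidden, n_kernels + hidden), [0.0] * hidden)
               for g in "ifog"}
W_out, b_out = rand_matrix(rng, NUM_ACTIONS, hidden), [0.0] * NUM_ACTIONS

# A synthetic "clip": 6 grayscale frames of 8x8 pixels.
clip = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(6)]
probs = classify_clip(clip, kernels, lstm_params, W_out, b_out, hidden)
```

Here `probs` is a probability distribution over the five action classes. Because the weights are random, the values carry no meaning; a real system would train both stages on labeled clips and would likely use a much deeper CNN.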