Applied Sciences (Sep 2021)
Comparing Human Activity Recognition Models Based on Complexity and Resource Usage
Abstract
Human Activity Recognition (HAR) is a field with many contrasting application domains, from medical applications to ambient assisted living and sports applications. With ever-changing use cases and devices also comes a need for newer and better HAR approaches. Machine learning has long been one of the predominant techniques for recognizing activities from extracted features. With the advent of deep learning techniques that push state-of-the-art results in many different domains, such as natural language processing and computer vision, researchers have also started to build deep neural nets for HAR. This increase in complexity makes it necessary to compare the newer approaches to the previous state-of-the-art algorithms, because not everything that is new is also better. This paper therefore compares typical machine learning models, such as a Random Forest (RF) and a Support Vector Machine (SVM), to two commonly used deep neural net architectures, Convolutional Neural Nets (CNNs) and Recurrent Neural Nets (RNNs), with regard not only to performance but also to model complexity. We measure complexity as the memory consumption, the mean prediction time, and the number of trainable parameters of the models. To achieve comparable results, all models are tested on the same publicly available dataset, the UCI HAR Smartphone dataset. By combining prediction performance and model complexity, we look for the models that achieve the best possible performance/complexity tradeoff and are therefore the most favourable for use in an application. According to our findings, the best model for a strictly memory-limited use case is the Random Forest, with an F1-Score of 88.34%, a memory consumption of only 0.1 MB, and a mean prediction time of 0.22 ms. The overall best model in terms of complexity and performance is the SVM with a linear kernel, with an F1-Score of 95.62%, a memory consumption of 2 MB, and a mean prediction time of 0.47 ms. The two deep neural nets are on par in terms of performance, but their increased complexity makes them less favourable for use.
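As a rough illustration of the quantities named in the abstract, the following minimal Python sketch shows one way a mean prediction time, a memory footprint, and an F1-Score could be measured. It is not the paper's measurement code. `ThresholdModel`, `parameter_size_mb`, `mean_prediction_time_ms`, and `f1_score` are hypothetical names introduced here. The pickled parameter size is only an assumed proxy for memory consumption, and the F1 shown is the plain binary variant, whereas a multi-class dataset would need an averaged variant.

```python
import pickle
import time
from statistics import mean


class ThresholdModel:
    """Hypothetical stand-in for a trained classifier (not from the paper)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, window):
        # Label a window 1 ("active") if its mean signal exceeds the threshold.
        return 1 if mean(window) > self.threshold else 0


def parameter_size_mb(model):
    # Assumed proxy for memory consumption: size of the pickled
    # model attributes, in megabytes.
    return len(pickle.dumps(vars(model))) / 1e6


def mean_prediction_time_ms(model, windows, repeats=5):
    # Average wall-clock time of a single predict() call, in milliseconds.
    timings = []
    for _ in range(repeats):
        for window in windows:
            start = time.perf_counter()
            model.predict(window)
            timings.append(time.perf_counter() - start)
    return 1000 * mean(timings)


def f1_score(y_true, y_pred, positive=1):
    # Binary F1: harmonic mean of precision and recall for one class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    model = ThresholdModel(threshold=0.25)
    windows = [[0.1, 0.2], [0.4, 0.6], [0.0, 0.1], [0.3, 0.5]]
    labels = [0, 1, 0, 1]
    predictions = [model.predict(w) for w in windows]
    print("F1:", f1_score(labels, predictions))
    print("Parameter size (MB):", parameter_size_mb(model))
    print("Mean prediction time (ms):", mean_prediction_time_ms(model, windows))
```

Reporting all three numbers side by side for each candidate model is what makes a performance/complexity tradeoff visible. Timing values depend on the hardware and are only comparable when measured on the same machine.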
Keywords