Current Directions in Biomedical Engineering (Sep 2022)
Network Architecture Influence on Facial Emotion Recognition
Abstract
Artificial intelligence (AI) is increasingly part of daily life through applications ranging from voice commands to facial recognition. One therapeutic application that AI could support is the treatment of people with Autism Spectrum Disorder. A closed-loop feedback system is planned in conjunction with a novel reward system that encourages the user to express emotions and rewards them for doing so in a virtual environment. In this work, five popular neural network architectures (VGG16, ResNet50, GoogleNet, ShuffleNet and EfficientNetb0) are studied and compared for Facial Emotion Recognition (FER), with the aim of finding a relation between accuracy and the features each architecture develops. Three datasets were used: OULU-CASIA for training and validation, and FACES and JAFFE for robustness analysis. The images were first preprocessed to eliminate background noise. Model performance was assessed by the true positive predictions, and Grad-CAM visualizations were used to show which image regions the networks focused on when classifying. Results showed that deep network architectures with a large parameter space performed best, and that architecture design had more influence on the region of focus than on classification results. This is attributed to the different layer combinations and parameters used for feature extraction. Shallow networks with a large parameter space performed better than deep networks with a small parameter space for the FER application.
Keywords