Emotion Classification Based on Pulsatile Images Extracted from Short Facial Videos via Deep Learning

Shlomi Talala; Shaul Shvimmer; Rotem Simhon; Michael Gilead; Yitzhak Yitzhaky

doi:10.3390/s24082620

Sensors (Apr 2024)

Affiliations

Shlomi Talala: Department of Electro-Optics and Photonics Engineering, School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva 84105, Israel
Shaul Shvimmer: Department of Electro-Optics and Photonics Engineering, School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva 84105, Israel
Rotem Simhon: School of Psychology, Tel Aviv University, Tel Aviv 39040, Israel
Michael Gilead: School of Psychology, Tel Aviv University, Tel Aviv 39040, Israel
Yitzhak Yitzhaky: Department of Electro-Optics and Photonics Engineering, School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva 84105, Israel

DOI: https://doi.org/10.3390/s24082620
Journal volume & issue: Vol. 24, no. 8, p. 2620

Abstract

Most human emotion recognition methods depend largely on classifying stereotypical facial expressions that represent emotions. However, such facial expressions do not necessarily correspond to actual emotional states and may instead reflect communicative intentions. In other cases, emotions are hidden, cannot be expressed, or have lower arousal manifested by less pronounced facial expressions, as may occur during passive video viewing. This study improves an emotion classification approach developed in a previous study, which classifies emotions remotely from short facial video data without relying on stereotypical facial expressions or contact-based methods. In this approach, we aim to remotely sense transdermal cardiovascular spatiotemporal facial patterns associated with different emotional states and analyze these data via machine learning. In this paper, we propose several improvements: better remote heart rate estimation via preliminary skin segmentation, an improved process for detecting heartbeat peaks and troughs, and better emotion classification accuracy obtained by employing an appropriate deep learning classifier that uses only RGB camera input data. We used the dataset obtained in the previous study, which contains facial videos of 110 participants who passively viewed 150 short videos that elicited the following five emotion types: amusement, disgust, fear, sexual arousal, and no emotion, while three cameras with different wavelength sensitivities (visible spectrum, near-infrared, and longwave infrared) recorded them simultaneously. From the short facial videos, we extracted unique high-resolution spatiotemporal, physiologically affected features and examined them as input features with different deep-learning approaches. An EfficientNet-B0 model type was able to classify participants’ emotional states with an overall average accuracy of 47.36% using a single input spatiotemporal feature map obtained from a regular RGB camera.
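
The abstract mentions heartbeat peak and trough detection and remote heart rate estimation without giving details. As a rough illustration of the general idea only, and not the authors' method, the sketch below finds local peaks and troughs in a one-dimensional pulse-like signal and derives a heart rate from the peak spacing. The synthetic sine signal, the 30 fps frame rate, and the function names `synthetic_pulse`, `local_extrema`, and `bpm_from_peaks` are all assumptions made for this example; a real pipeline would start from a skin-segmented video trace and would need filtering and noise handling.

```python
import math


def synthetic_pulse(fps=30.0, seconds=4.0, bpm=72.0):
    """Hypothetical stand-in for a mean skin-pixel intensity trace.

    Returns a clean sine wave sampled at `fps`; real traces are noisy.
    """
    n = int(fps * seconds)
    freq_hz = bpm / 60.0
    return [math.sin(2 * math.pi * freq_hz * i / fps) for i in range(n)]


def local_extrema(signal):
    """Return (peaks, troughs) as lists of sample indices.

    A sample is a peak if it is strictly greater than both neighbors,
    and a trough if it is strictly smaller than both neighbors.
    """
    peaks, troughs = [], []
    for i in range(1, len(signal) - 1):
        if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]:
            peaks.append(i)
        elif signal[i] < signal[i - 1] and signal[i] < signal[i + 1]:
            troughs.append(i)
    return peaks, troughs


def bpm_from_peaks(peaks, fps):
    """Estimate heart rate (beats per minute) from the mean peak-to-peak interval."""
    intervals = [(b - a) / fps for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


if __name__ == "__main__":
    fps = 30.0  # assumed camera frame rate
    trace = synthetic_pulse(fps=fps, seconds=4.0, bpm=72.0)
    peaks, troughs = local_extrema(trace)
    print("peak indices:", peaks)
    print("trough indices:", troughs)
    print("estimated bpm:", round(bpm_from_peaks(peaks, fps), 2))
```

Strict neighbor comparison is the simplest possible detector and is chosen here only for clarity; on real video-derived signals it would pick up spurious extrema, which is presumably why a dedicated detection step is worth improving.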

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
