Spatiotemporal Sensitive Network for Non-Contact Heart Rate Prediction from Facial Videos

Liying Su; Yitao Wang; Dezhao Zhai; Yuping Shi; Yinghao Ding; Guohua Gao; Qinwei Li; Ming Yu; Hang Wu

doi:10.3390/app14209551

Applied Sciences (Oct 2024)

Affiliations

Liying Su: College of Mechanical & Energy Engineering, Beijing University of Technology, Beijing 100124, China
Yitao Wang: College of Mechanical & Energy Engineering, Beijing University of Technology, Beijing 100124, China
Dezhao Zhai: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin 300384, China
Yuping Shi: Tianjin Key Laboratory for Advanced Signal Processing, Civil Aviation University of China, Tianjin 300300, China
Yinghao Ding: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin 300384, China
Guohua Gao: College of Mechanical & Energy Engineering, Beijing University of Technology, Beijing 100124, China
Qinwei Li: Tianjin Key Laboratory for Advanced Signal Processing, Civil Aviation University of China, Tianjin 300300, China
Ming Yu: Systems Engineering Institute, Academy of Military Sciences, People’s Liberation Army, Tianjin 300161, China
Hang Wu: Systems Engineering Institute, Academy of Military Sciences, People’s Liberation Army, Tianjin 300161, China

DOI: https://doi.org/10.3390/app14209551
Journal volume & issue: Vol. 14, no. 20, p. 9551

Abstract

Heart rate (HR) is an important indicator of overall physical and mental health and plays a crucial role in diagnosing cardiovascular and neurological diseases. Recent research has shown that changes in blood volume over the cardiac cycle cause variations in the light absorption of human skin, and that these variations, captured in facial video, can be used for non-contact HR estimation. However, most existing methods rely on a single video modality (such as RGB or near-infrared, NIR) and often yield suboptimal results because of noise and the limitations of a single information source. To overcome these challenges, this paper proposes a multimodal information fusion architecture, the spatiotemporal sensitive network (SS-Net), for non-contact HR estimation. First, spatiotemporal feature maps are used to represent the physiological signals in the RGB and NIR videos. Next, a spatiotemporal sensitive (SS) module extracts useful physiological signal information from both the RGB and NIR spatiotemporal maps. A multi-level spatiotemporal context fusion (MLSC) module then fuses the visible-light and infrared modalities so that they complement each other. Finally, the fused features from different levels are refined in task-specific branches to predict both remote photoplethysmography (rPPG) signals and HR. Experiments on three datasets demonstrate that the proposed SS-Net outperforms existing methods.
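
As a rough illustration of the principle the abstract relies on (a periodic blood-volume signal carries the heart rate), the sketch below estimates HR from a one-dimensional pulse-like trace by locating the dominant spectral peak. It is not SS-Net and not the authors' method: it is a generic spectral-peak baseline, and the function name `estimate_hr_bpm`, the 0.7 to 4.0 Hz search band, the 30 fps frame rate, and the synthetic trace are all assumptions made for this example.

```python
import numpy as np


def estimate_hr_bpm(signal, fps, lo_hz=0.7, hi_hz=4.0):
    """Estimate heart rate (beats per minute) from a pulse-like trace.

    Hypothetical helper for illustration only: a plain FFT-peak baseline,
    not the SS-Net architecture described in the abstract.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC offset
    x = x * np.hanning(len(x))            # taper to reduce spectral leakage
    power = np.abs(np.fft.rfft(x)) ** 2   # one-sided power spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)

    # Search only a plausible heart-rate band (assumed: 42 to 240 bpm).
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    if not band.any():
        raise ValueError("signal too short for the requested frequency band")

    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


# Synthetic stand-in for a mean skin-pixel intensity trace (assumed data):
# a 1.2 Hz pulse (72 bpm) plus sensor noise and a slow illumination drift.
fps = 30.0
t = np.arange(600) / fps                  # 20 seconds of frames
rng = np.random.default_rng(0)
trace = (
    0.02 * np.sin(2 * np.pi * 1.2 * t)
    + 0.01 * rng.standard_normal(t.size)
    + 0.05 * t / t[-1]
)

hr = estimate_hr_bpm(trace, fps)
print(f"estimated HR: {hr:.1f} bpm")      # should land close to 72 bpm
```

A single-trace baseline like this is sensitive to motion and lighting noise, which is the limitation the abstract cites as motivation for fusing RGB and NIR information.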

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
