IEEE Access (Jan 2024)
A Study on the Implementation of Temporal Noise-Robust Methods for Acquiring Vital Signs
Abstract
There has been a surge in research focused on the analysis of vital signs using remote photoplethysmography (rPPG) sensors, as opposed to traditional photoplethysmography (PPG) methods. Unlike PPG, rPPG imposes no spatial constraints and employs a straightforward measurement technique, making it increasingly prevalent. Its integration into image processing harnesses the remarkable advances in artificial-intelligence technology, achieving accuracy comparable to that of traditional PPG sensors. In prior studies, obtaining vital signs often required a cumbersome fixation of facial positions within frames to enhance predictive accuracy; even with such fixation, notably high accuracy remained elusive. Here, we introduce a simple yet robust approach utilizing videos captured by an rPPG sensor, ensuring both high accuracy and resilience to noise. We propose a convolutional neural network model designed to resist interference from noise that may arise in the initial stages, coupled with effective preprocessing techniques to attain superior predictive accuracy. Data extracted by a facial extractor is preprocessed via normalization; leveraging the Temporal Shift Module (TSM), the model then captures temporal relationships efficiently without incurring additional computational overhead. It mitigates noise-signal interference from non-facial regions through multiple attention masks and augments prediction accuracy via skip connections. Moreover, we compile a dataset of pulse rate and breath rate data tailored specifically to the East Asian population. The proposed process demonstrates outstanding performance in predicting both pulse rate and breath rate.
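The core idea behind the TSM mentioned above is that shifting a small fraction of feature channels along the temporal axis lets a 2-D convolution mix information across frames at zero extra multiply-add cost. A minimal NumPy sketch of that shift operation (the function name, array layout, and `shift_div` parameter are illustrative assumptions, not the authors' exact implementation):

```python
import numpy as np

def temporal_shift(x, shift_div=8):
    """Illustrative Temporal Shift Module mechanism.

    x: feature tensor of shape (T, C, H, W) -- frames, channels, height, width.
    One 1/shift_div slice of channels is shifted one step backward in time,
    another 1/shift_div slice one step forward, and the rest stay in place;
    vacated frame positions are zero-padded.
    """
    t, c, h, w = x.shape
    fold = c // shift_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                # pull features from the next frame
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]  # pull features from the previous frame
    out[:, 2 * fold:] = x[:, 2 * fold:]           # remaining channels unchanged
    return out

# Toy input: 2 frames, 8 channels, 1x1 spatial, so fold = 1.
x = np.arange(16, dtype=float).reshape(2, 8, 1, 1)
y = temporal_shift(x)
```

Because the shift is a pure memory movement, it adds no multiplications, which is why the abstract can claim temporal modeling "without incurring additional computational overhead".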
Keywords