IEEE Access (Jan 2024)
A Study on the Implementation of Temporal Noise-Robust Methods for Acquiring Vital Signs
Abstract
There has been a surge in research focused on the analysis of vital signs using remote photoplethysmography (rPPG) sensors, as opposed to traditional photoplethysmography (PPG) methods. Unlike PPG, rPPG imposes no spatial constraints and employs a straightforward measurement technique, making it increasingly prevalent. Its integration into image processing harnesses the remarkable advances in artificial-intelligence technology, achieving accuracy comparable to that of traditional PPG sensors. In prior studies, obtaining vital signs often required a cumbersome fixation of facial positions within frames to enhance predictive accuracy; even with such fixation, notably high accuracy remained elusive. Here, we introduce a simple yet robust approach utilizing videos captured by an rPPG sensor, ensuring both high accuracy and resilience to noise. We propose a convolutional neural network model designed to resist interference from noise that may arise in the initial stages, coupled with effective preprocessing techniques to attain superior predictive accuracy. Data extracted by a facial extractor is preprocessed via normalization; leveraging the Temporal Shift Module (TSM), the model then captures temporal relationships efficiently without incurring additional computational overhead. It mitigates noise-signal interference from non-facial regions through multiple attention masks and augments prediction accuracy via skip connections. Moreover, we compile a dataset of pulse rate and breath rate data tailored specifically to the East Asian population. The proposed process demonstrates outstanding performance in predicting both pulse rate and breath rate.
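The core idea behind the TSM mentioned above is that shifting a small fraction of feature channels along the temporal axis lets a 2-D convolution mix information across frames at zero extra multiply-add cost. A minimal NumPy sketch of that shift operation (the function name, array layout, and `shift_div` parameter are illustrative assumptions, not the authors' exact implementation):

```python
import numpy as np

def temporal_shift(x, shift_div=8):
    """Illustrative Temporal Shift Module mechanism.

    x: feature tensor of shape (T, C, H, W) -- frames, channels, height, width.
    One 1/shift_div slice of channels is shifted one step backward in time,
    another 1/shift_div slice one step forward, and the rest stay in place;
    vacated frame positions are zero-padded.
    """
    t, c, h, w = x.shape
    fold = c // shift_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                # pull features from the next frame
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]  # pull features from the previous frame
    out[:, 2 * fold:] = x[:, 2 * fold:]           # remaining channels unchanged
    return out

# Toy input: 2 frames, 8 channels, 1x1 spatial, so fold = 1.
x = np.arange(16, dtype=float).reshape(2, 8, 1, 1)
y = temporal_shift(x)
```

Because the shift is a pure memory movement, it adds no multiplications, which is why the abstract can claim temporal modeling "without incurring additional computational overhead".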
Keywords