TCEDN: A Lightweight Time-Context Enhanced Depression Detection Network

Keshan Yan; Shengfa Miao; Xin Jin; Yongkang Mu; Hongfeng Zheng; Yuling Tian; Puming Wang; Qian Yu; Da Hu

doi:10.3390/life14101313

Life (Oct 2024)

Affiliations

Keshan Yan: School of Software, Yunnan University, Kunming 650000, China
Shengfa Miao: School of Software, Yunnan University, Kunming 650000, China
Xin Jin: School of Software, Yunnan University, Kunming 650000, China
Yongkang Mu: School of Software, Yunnan University, Kunming 650000, China
Hongfeng Zheng: School of Software, Yunnan University, Kunming 650000, China
Yuling Tian: School of Software, Yunnan University, Kunming 650000, China
Puming Wang: School of Software, Yunnan University, Kunming 650000, China
Qian Yu: School of Software, Yunnan University, Kunming 650000, China
Da Hu: Fengtu Technology (Shenzhen) Co., Ltd., Shenzhen 518057, China

DOI: https://doi.org/10.3390/life14101313
Journal volume & issue: Vol. 14, no. 10, p. 1313

Abstract

The automatic video recognition of depression is becoming increasingly important in clinical applications. However, traditional depression recognition models still face practical challenges: high computational cost, ineffective use of facial movement features, and spatial feature degradation caused by model stitching. To overcome these challenges, this work proposes a lightweight Time-Context Enhanced Depression Detection Network (TCEDN). We first use attention-weighted blocks to aggregate and enhance video frame-level features, which reduces the model’s computational workload. Next, we integrate the temporal and spatial changes of raw video features and facial movement features using self-learned weights, which improves the precision of depression detection. Finally, we construct a fusion network that combines a 3-Dimensional Convolutional Neural Network (3D-CNN) with a Convolutional Long Short-Term Memory Network (ConvLSTM); it avoids feature flattening to minimize spatial feature loss and predicts the depression score. Tests on the AVEC2013 and AVEC2014 datasets show that our approach yields results on par with state-of-the-art video-based depression detection techniques, with significantly lower computational complexity than mainstream methods.
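The two weighting ideas mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the attention-weighted blocks or the self-learned fusion weights are computed, so the function names (`aggregate_frames`, `fuse_streams`), the softmax normalisation, and the scores passed in by hand are all assumptions made for illustration. In a trained network the scores and logits would be learned parameters or network outputs.

```python
import math


def softmax(scores):
    """Turn arbitrary real scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_frames(frame_features, frame_scores):
    """Collapse a block of frame feature vectors into one vector.

    Hypothetical stand-in for an attention-weighted block: each frame gets
    a softmax weight, and the block is replaced by the weighted sum, so
    later layers process one vector instead of many frames.
    """
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    return [
        sum(w * f[i] for w, f in zip(weights, frame_features))
        for i in range(dim)
    ]


def fuse_streams(raw, motion, raw_logit, motion_logit):
    """Blend raw-video and facial-movement features with two weights.

    The logits are assumed to be trainable scalars; softmax keeps the
    resulting pair of weights positive and summing to 1.
    """
    w_raw, w_motion = softmax([raw_logit, motion_logit])
    return [w_raw * r + w_motion * m for r, m in zip(raw, motion)]


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0]]
    # Equal scores give equal weights, i.e. a plain average of the frames.
    print(aggregate_frames(frames, [0.0, 0.0]))
    # Equal logits average the two feature streams element-wise.
    print(fuse_streams([2.0, 4.0], [0.0, 0.0], 0.0, 0.0))
```

With equal scores the aggregation reduces to a plain average; raising one frame's score shifts the result toward that frame, which is the sense in which such a block can emphasise informative frames while shrinking the input to later layers.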

Published in Life

ISSN: 2075-1729 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/life
