Recognition of Audio Depression Based on Convolutional Neural Network and Generative Antagonism Network Model

Zhiyong Wang; Longxi Chen; Lifeng Wang; Guangqiang Diao

doi:10.1109/ACCESS.2020.2998532

IEEE Access (Jan 2020)

Affiliations

Zhiyong Wang: School of Information Engineering, Shandong Youth University of Political Science, Jinan, China
Longxi Chen: School of Information Engineering, Shandong Youth University of Political Science, Jinan, China
Lifeng Wang: School of Information Engineering, Shandong Youth University of Political Science, Jinan, China
Guangqiang Diao: School of Information Engineering, Shandong Youth University of Political Science, Jinan, China

DOI: https://doi.org/10.1109/ACCESS.2020.2998532
Journal volume & issue: Vol. 8
pp. 101181 – 101191

Abstract

This paper proposes an audio-based depression recognition method that combines a convolutional neural network (CNN) with a generative antagonism network (commonly called a generative adversarial network, GAN) model. First, the dataset is preprocessed: long-term mute segments are removed and the remaining audio is spliced into a new audio file. Next, speech features such as Mel-scale Frequency Cepstral Coefficients (MFCCs), short-term energy, and spectral entropy are extracted using an audio difference normalization algorithm. The extracted feature matrices, which represent attributes unique to each subject's voice, form the data basis for model training. A depression recognition model, DR AudioNet, is then built by combining CNN and GAN. With DR AudioNet, the former model is optimized, and recognition and classification are completed using the normalized features of the two segments adjacent to the current audio segment (the one before it and the one after it). Experimental results on the AViD-Corpus and DAIC-WOZ datasets show that the proposed method effectively reduces depression recognition error compared with existing methods, with RMSE and MAE values on both datasets better than those of the comparison algorithms by more than 5%.
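The preprocessing and two of the features named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the relative energy threshold, the minimum silent-run length, the frame length, and the synthetic test signal are all assumptions chosen for illustration. The audio difference normalization algorithm and DR AudioNet are not reproduced here, and MFCC extraction is omitted because it would normally come from an audio library.

```python
import numpy as np

def short_term_energy(frames):
    """Sum of squared samples in each frame (one value per row)."""
    return np.sum(frames ** 2, axis=1)

def spectral_entropy(frames, eps=1e-12):
    """Shannon entropy (in bits) of each frame's normalized power spectrum."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    prob = power / (power.sum(axis=1, keepdims=True) + eps)
    return -np.sum(prob * np.log2(prob + eps), axis=1)

def remove_long_silence(x, frame_len, rel_threshold=0.01, min_silent_frames=5):
    """Drop long runs of low-energy frames and splice the rest together.

    Assumed rule (not from the paper): a frame is silent when its energy is
    below rel_threshold * max frame energy, and a run of silent frames is
    removed only if it spans at least min_silent_frames frames.
    Assumes x holds at least one full frame.
    """
    n = len(x) // frame_len
    frames = x[: n * frame_len].reshape(n, frame_len)  # non-overlapping frames
    energy = short_term_energy(frames)
    silent = energy < rel_threshold * energy.max()

    keep = np.ones(n, dtype=bool)
    start = None
    for i in range(n + 1):
        if i < n and silent[i]:
            if start is None:
                start = i  # a silent run begins here
        else:
            if start is not None and i - start >= min_silent_frames:
                keep[start:i] = False  # long silent run: drop it
            start = None
    return frames[keep].reshape(-1)

# Synthetic example: 1 s tone, 1 s silence, 1 s tone at 8 kHz (assumed values).
sr = 8000
frame_len = 200  # 25 ms frames
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
signal = np.concatenate([tone, np.zeros(sr), tone])

cleaned = remove_long_silence(signal, frame_len)
frames = cleaned.reshape(-1, frame_len)
energy = short_term_energy(frames)
entropy = spectral_entropy(frames)

# One row per frame, one column per feature: a small feature matrix.
features = np.column_stack([energy, entropy])
```

In this sketch the silent second is removed entirely, so `features` has one row for each remaining 25 ms frame. A real pipeline would add MFCC columns and apply the paper's normalization before training.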

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
