e-Prime: Advances in Electrical Engineering, Electronics and Energy (Sep 2024)

Identification of psychological stress from speech signal using deep learning algorithm

  • Ankit Kumar,
  • Mohd Akbar Shaun,
  • Brijesh Kumar Chaurasia

Journal volume & issue
Vol. 9
p. 100707

Abstract

Psychological activity has many dimensions, each of which correlates with behavior that the human body produces. Understanding how psychological events relate to external action units is an active research topic for exploring human behavior and its dependencies. Existing work has applied various deep learning algorithms to characterize the correlation between psychological activity and human emotion. Psychological analysis in the medical field is time-consuming and costly: it requires monitoring the patient continuously over a period of time and conducting several interview sessions to determine the emotional severity of an individual. The shortage of skilled specialists and limited medical knowledge of human emotions drive the need for computer vision approaches to emotion recognition, particularly for disorders. The proposed study assesses the use of speech signals to identify psychological disorders in terms of stress characteristics, using a deep learning model that combines feed-forward networks and long short-term memory (LSTM) to identify the degree of psychological disease. The study used a standard speech dataset gathered from a variety of patients through a standard questionnaire format, in a survey conducted across a few Indian states. Speech samples were taken from patients whose cortisol levels were higher than 10 %. Speech signals were gathered from each patient to assess the relationship between speech and psychological activity. The spectrogram of the Mel filter bank coefficients of each speech signal was analyzed, and the features were further divided into stress and non-stress categories. The suggested model classifies stress and non-stress features for the 150 subjects in the voice dataset with an average accuracy of 98 %. The model is found to be robust for applications such as preventing suicide, improving decision-making in the diagnosis of patients with depression, and improving the overall mental healthcare system.
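
As an illustration of the Mel filter bank spectrogram mentioned in the abstract, the following is a minimal sketch of how log Mel filter bank features might be computed from a speech waveform. It is not the authors' pipeline: the function names, the 16 kHz sample rate, the 512-point FFT, the 160-sample hop, and the 40 filters are all assumptions chosen for illustration, since the abstract does not state them.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style Hz-to-Mel conversion.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_filters, n_fft // 2 + 1)."""
    # Filter edges are equally spaced on the Mel scale from 0 Hz to Nyquist.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_spectrogram(signal, sample_rate=16000, n_fft=512, hop=160, n_filters=40):
    """Log Mel filter bank energies, shape (n_frames, n_filters).

    All default parameter values are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_energy = power @ mel_filter_bank(n_filters, n_fft, sample_rate).T
    # A small constant avoids taking the log of zero.
    return np.log(mel_energy + 1e-10)
```

Each row of the result is one time frame and each column one Mel band, so the array has the time-ordered sequence shape that a recurrent model such as an LSTM typically consumes.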

Keywords