IEEE Access (Jan 2021)

Scalogram as a Representation of Emotional Speech

  • Pawel Powroznik,
  • Piotr Wojcicki,
  • Slawomir W. Przylucki

DOI
https://doi.org/10.1109/ACCESS.2021.3127581
Journal volume & issue
Vol. 9
pp. 154044 – 154057

Abstract

Implementing an emotion recognition system based on spoken text is very hard. Computer applications have great difficulty understanding the non-literal meaning of statements, as well as irony or situational jokes. The article describes how to represent emotional speech in the form of scalograms, which are the result of processing the speech signal with the Discrete Wavelet Transform (DWT). It also presents a method of processing the scalograms to extract input data for natural language processing algorithms that recognise the emotional state. The following emotional states were considered during the research: joy, anger, boredom, sadness, fear and the neutral state. The developed method was tested on databases containing recordings of emotional speech in the following languages: Polish, English, German and Danish. Depending on the language and the classifier used, the results obtained ranged from over 62% to over 94%. The use of fuzzy classifiers greatly improves the time and efficiency of classification.
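
To illustrate what a scalogram derived from a discrete wavelet transform can look like as data, the following is a minimal sketch and not the authors' implementation. It assumes a Haar wavelet, a signal whose length is divisible by 2 raised to the number of levels, and squared detail coefficients as the scalogram values; the function names `haar_dwt_step` and `dwt_scalogram` are hypothetical and chosen only for this example.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def dwt_scalogram(signal, levels):
    """Build a levels-by-time grid of squared Haar detail coefficients.

    Row k holds the energy of the detail coefficients at level k + 1.
    Each coefficient is repeated 2**level times so that every row spans
    the full length of the input signal (assumption made for this sketch).
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")

    rows = []
    approx = list(signal)
    for level in range(1, levels + 1):
        approx, detail = haar_dwt_step(approx)
        repeat = 2 ** level
        row = []
        for coeff in detail:
            row.extend([coeff * coeff] * repeat)
        rows.append(row)
    return rows


# Toy input standing in for a speech frame: a 4-cycle sine over 64 samples.
frame = [math.sin(2.0 * math.pi * 4.0 * t / 64.0) for t in range(64)]
scalogram = dwt_scalogram(frame, levels=3)
```

Each row of `scalogram` corresponds to one decomposition level (coarser time resolution at higher levels), and each column to a sample position. A grid of this kind could then be treated as an image or flattened into a feature vector for a classifier.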

Keywords