Модели, системы, сети в экономике, технике, природе и обществе (Apr 2022)

SPEECH/PAUSE SEGMENTATION METHOD BASED ON TEAGER ENERGY OPERATOR

  • A.K. Alimuradov

DOI
https://doi.org/10.21685/2227-8486-2021-4-5
Journal volume & issue
no. 4

Abstract

Background. Segmenting speech into voiced sections, unvoiced sections, and pauses is a key task in most speech applications. It is especially important in systems that assess a person's psycho-emotional state from speech, because the durations of voiced sections, unvoiced sections, and pauses are informative parameters related to naturally expressed human emotions. Materials and methods. The second-order differential Teager energy operator was used, which is highly sensitive to changes in signal amplitude and frequency. The method was implemented in MATLAB (MathWorks). Results. A speech/pause segmentation method has been developed that linearly divides a speech signal into fragments, calculates the energy characteristic using the Teager energy operator, calculates the short-term energy values, and determines the speech/pause status of each fragment from calculated threshold values of the short-term energy. The developed method was studied to assess its speech/pause segmentation effectiveness relative to the classical method based on short-term energy analysis. Conclusions. According to the research results, speech/pause segmentation efficiency increases by 5.26% and 5.51% for type I and type II errors, respectively. The proposed speech/pause segmentation method can be effectively used in systems for assessing human psycho-emotional state, owing to its high sensitivity to the sudden changes in signal amplitude and frequency that accompany unstable vocal motor skills.
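
The processing chain named in the abstract (fragmentation, Teager energy operator, short-term energy, thresholding) can be illustrated with a minimal Python sketch. It is not the author's MATLAB implementation. It uses the standard discrete form of the operator, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The function names, the non-overlapping frames, the use of the mean absolute operator output as short-term energy, and the threshold rule (a fixed fraction of the maximum frame energy, controlled by the hypothetical parameter threshold_ratio) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the first
    and last samples have no two-sided neighbourhood.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def short_term_energy(values, frame_len):
    """Mean absolute value per non-overlapping frame (assumed definition).

    A trailing partial frame is dropped.
    """
    n_frames = len(values) // frame_len
    return [
        sum(abs(v) for v in values[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def segment_speech_pause(x, frame_len, threshold_ratio=0.1):
    """Label each frame True (speech) or False (pause).

    Assumption: the threshold is a fixed fraction of the maximum frame
    energy; the paper's actual threshold calculation is not given in the
    abstract.
    """
    energy = short_term_energy(teager_energy(x), frame_len)
    if not energy:
        return []
    threshold = threshold_ratio * max(energy)
    return [e > threshold for e in energy]


# Synthetic test signal: 0.1 s of silence, 0.1 s of a 200 Hz tone, 0.1 s of silence.
fs = 8000          # sampling rate in Hz (illustrative)
frame_len = 80     # 10 ms frames at 8 kHz (illustrative)
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(800)]
signal = silence + tone + silence

labels = segment_speech_pause(signal, frame_len)
```

For a pure sinusoid with amplitude A and digital frequency Ω, the operator output is the constant A² sin²(Ω), so it grows with both amplitude and frequency. This is the sensitivity the abstract refers to. Real speech would need a more careful threshold than the fixed ratio assumed here.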

Keywords