Serbian Journal of Electrical Engineering (Jan 2022)

Hilbert spectrum based features for speech/music classification

  • Kumar Arvind,
  • Solanki Sandeep Singh,
  • Chandra Mahesh

DOI
https://doi.org/10.2298/SJEE2202239K
Journal volume & issue
Vol. 19, no. 2
pp. 239 – 259

Abstract

Automatic speech/music classification uses signal processing techniques to categorize multimedia content into classes. The proposed work explores the Hilbert Spectrum (HS), obtained from the AM-FM components of an audio signal, also called Intrinsic Mode Functions (IMFs), to classify an incoming audio signal as speech or music. The HS is a two-dimensional representation of the instantaneous energies (IE) and instantaneous frequencies (IF) obtained by applying the Hilbert Transform to the IMFs. The HS is further processed using a Mel filter bank and the Discrete Cosine Transform (DCT) to generate novel cepstral features based on IF and Instantaneous Amplitude (IA). The results were validated on three databases: the Slaney (S&S) database, GTZAN, and MUSAN. To evaluate the general applicability of the proposed features, extensive experiments were conducted on different combinations of audio files from the S&S, GTZAN, and MUSAN databases, with promising results. Finally, the performance of the system is compared with that of existing cepstral features and of previous works in this domain.
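
The processing chain described in the abstract (IMF, Hilbert Transform, IA and IF, Mel filter bank, DCT) can be illustrated with a minimal sketch. It is not the authors' implementation; several choices are assumptions made only for illustration. The decomposition into IMFs is skipped and a synthetic 1 kHz tone stands in for a single IMF. The Mel filter bank is applied by weighting each sample's energy (IA squared) by triangular filters evaluated at its IF. All function names, the 8 kHz sampling rate, the 256-sample frame, and the 8 filters are hypothetical. Only the Python standard library is used, so the analytic signal is computed with a direct DFT, which is practical only for short frames.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal via a direct O(N^2) DFT (adequate for short frames)."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(n):
        if 0 < k < n / 2:
            spec[k] *= 2.0   # double the positive frequencies
        elif k > n / 2:
            spec[k] = 0.0    # discard the negative frequencies
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def instantaneous_amp_freq(imf, fs):
    """Instantaneous amplitude (IA) and frequency (IF, in Hz) of one component."""
    z = analytic_signal(imf)
    amp = [abs(v) for v in z]
    # The phase of z[t+1] * conj(z[t]) is the phase increment per sample,
    # already wrapped to (-pi, pi], so no explicit unwrapping is needed.
    freq = [cmath.phase(z[t + 1] * z[t].conjugate()) * fs / (2.0 * math.pi)
            for t in range(len(z) - 1)]
    return amp[:-1], freq


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_energies(amp, freq, fs, n_filters):
    """Accumulate IA^2 into triangular Mel filters evaluated at each IF value."""
    top = hz_to_mel(fs / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    energies = [0.0] * n_filters
    for a, f in zip(amp, freq):
        for m in range(n_filters):
            lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
            if lo < f <= mid:
                w = (f - lo) / (mid - lo)    # rising slope of the triangle
            elif mid < f < hi:
                w = (hi - f) / (hi - mid)    # falling slope of the triangle
            else:
                w = 0.0
            energies[m] += w * a * a
    return energies


def dct2(v):
    """Unnormalized DCT-II."""
    n = len(v)
    return [sum(v[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]


def cepstral_features(amp, freq, fs, n_filters=8):
    """Log Mel-band energies followed by a DCT (assumed feature definition)."""
    energies = mel_band_energies(amp, freq, fs, n_filters)
    return dct2([math.log(e + 1e-10) for e in energies])


# Hypothetical stand-in for one IMF: a 1 kHz tone of amplitude 0.8.
fs = 8000
n = 256
imf = [0.8 * math.cos(2.0 * math.pi * 1000.0 * t / fs) for t in range(n)]
amp, freq = instantaneous_amp_freq(imf, fs)
features = cepstral_features(amp, freq, fs, n_filters=8)
print(features)
```

In a full system, each frame of audio would first be decomposed into several IMFs, and the per-IMF contributions would be combined before the logarithm and DCT; how the paper combines them and how it separates IF-based from IA-based features is not specified in the abstract.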

Keywords