Кібернетика та комп'ютерні технології (Mar 2025)
Chirplet Analysis of Speech Signals Based on the Hilbert–Huang Transform
Abstract
Introduction. This article proposes a novel approach to speech signal analysis based on the chirplet transform, integrating the Hilbert–Huang transform with chirplet analysis. The method provides enhanced segmentation and feature extraction capabilities, enabling accurate identification of time-frequency characteristics in speech signals. It is intended to overcome the limitations of traditional methods, such as the short-time Fourier transform and wavelet analysis, by offering a more adaptive solution tailored to the nonlinear and non-stationary nature of speech signals. The purpose of the work is to develop a numerical-analytic method for phonetic analysis of speech signals. The central feature of the methodology is the combination of empirical mode decomposition from the Hilbert–Huang transform with chirplet projections onto alternative nonlinear scales, such as the mel scale. This approach ensures superior localization of dynamic changes in the time-frequency domain while remaining consistent with the perceptual characteristics of human hearing. By leveraging chirplet transforms, the proposed method enhances the detection of linguistic elements, including phonemes and other speech segments, even in the presence of overlapping components. Results. The practical implementation of the method is demonstrated through experimental analysis of speech signals. The results indicate an improvement in the accuracy of segmentation and noise suppression compared to conventional approaches. Time-frequency visualizations illustrate the adaptability of the method in handling complex speech signals with varying dynamic properties. Conclusions. This research contributes to advancements in speech analysis, recognition, and audio signal processing, offering potential applications in areas such as voice-controlled systems, linguistic studies, and speech recognition technologies.
The proposed approach can be further refined and integrated with machine learning algorithms to automate the classification and analysis of speech segments. The article provides a foundation for future studies on the intersection of chirplet transforms and nonlinear signal processing, emphasizing their role in addressing real-world challenges in speech and audio technologies.
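As an illustration of the idea of projecting a signal component onto chirplets placed on a mel-spaced frequency grid, a minimal Python sketch follows. It is not the authors' implementation: the empirical mode decomposition step is omitted entirely, the Gaussian chirplet form, the mel formula 2595·log10(1 + f/700), and all function and parameter names (`gaussian_chirplet`, `best_chirplet`, `mel_centres`, `chirp_rates`) are assumptions chosen for the example. The input is meant to stand in for a single mode already extracted from a speech frame.

```python
import cmath
import math


def hz_to_mel(f_hz):
    # One common mel convention (assumed here, not taken from the article).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def gaussian_chirplet(n, fs, t0, f0, chirp_rate, sigma):
    """Unit-energy complex Gaussian chirplet with n samples.

    t0: centre time (s), f0: centre frequency (Hz),
    chirp_rate: linear frequency slope (Hz/s), sigma: envelope width (s).
    """
    atom = []
    for k in range(n):
        t = k / fs - t0
        envelope = math.exp(-0.5 * (t / sigma) ** 2)
        phase = 2.0 * math.pi * (f0 * t + 0.5 * chirp_rate * t * t)
        atom.append(envelope * cmath.exp(1j * phase))
    norm = math.sqrt(sum(abs(a) ** 2 for a in atom))
    return [a / norm for a in atom]


def project(signal, atom):
    # Magnitude of the inner product <signal, atom>.
    return abs(sum(s * a.conjugate() for s, a in zip(signal, atom)))


def best_chirplet(signal, fs, t0, sigma, mel_centres, chirp_rates):
    """Exhaustive search over a mel-spaced frequency grid and a set of chirp rates.

    Returns (score, centre frequency in Hz, chirp rate in Hz/s) of the best atom.
    """
    best = (0.0, None, None)
    for m in mel_centres:
        f0 = mel_to_hz(m)
        for c in chirp_rates:
            atom = gaussian_chirplet(len(signal), fs, t0, f0, c, sigma)
            score = project(signal, atom)
            if score > best[0]:
                best = (score, f0, c)
    return best
```

Because the candidate centre frequencies are spaced uniformly in mel rather than in hertz, the grid is denser at low frequencies, which is one simple way to bias the search toward the resolution of human hearing. A practical system would replace the exhaustive search with a finer or adaptive parameter search and would apply it to each extracted mode in turn.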
Keywords