IEEE Access (Jan 2023)

A Pitch Estimation Algorithm for Speech in Complex Noise Environments Based on the Radon Transform

  • Bai Li,
  • Xianwu Zhang

DOI
https://doi.org/10.1109/ACCESS.2023.3240181
Journal volume & issue
Vol. 11
pp. 9876 – 9889

Abstract

The pitch period is an essential feature in many speech-related tasks. Because most practical systems collect speech in complex noise environments, the noise robustness of a pitch estimation algorithm has become more critical than ever. However, many state-of-the-art algorithms perform poorly on noisy speech at a low signal-to-noise ratio (SNR). This study presents a new noise-resistant pitch estimation algorithm based on the Radon transform, in which the classical equation is modified to reduce the influence of formants. In addition, the difference between the pitch candidates of consecutive frames is used as part of the decoding criterion of the Viterbi algorithm, which strengthens the correlation between successive pitch estimates and makes the pitch contours smoother. We synthesized three noisy speech databases with 18 types of collected environmental noise and compared our algorithm with 7 state-of-the-art algorithms. The proposed algorithm performs best on the CSTR and self-recorded databases and reduces the Gross Pitch Error (GPE) rate by over 12% at 0 dB SNR relative to the Bayesian Pitch Tracker. In particular, the GPE rate of the proposed algorithm stays under 25% at 0 dB SNR, while BaNa only achieves 35%.
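
The Viterbi step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `viterbi_pitch_track`, the per-candidate `scores`, the `jump_weight` parameter, and the use of the absolute difference in Hz as the transition penalty are all assumptions made here for illustration. The abstract states only that the difference between candidates of consecutive frames forms part of the decoding criterion.

```python
import numpy as np


def viterbi_pitch_track(candidates, scores, jump_weight=0.01):
    """Pick one pitch candidate per frame with Viterbi decoding.

    candidates: (T, K) pitch candidates in Hz for T frames.
    scores:     (T, K) per-candidate salience, higher is better (assumed input).
    jump_weight: hypothetical weight on the frame-to-frame pitch difference.
    """
    candidates = np.asarray(candidates, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n_frames, n_cands = candidates.shape

    cost = np.empty((n_frames, n_cands))       # best accumulated cost per candidate
    back = np.zeros((n_frames, n_cands), int)  # best predecessor per candidate
    cost[0] = -scores[0]

    for t in range(1, n_frames):
        # jump[i, j]: |pitch difference| between candidate i at t-1 and j at t.
        # Assumed form of the penalty; the paper's exact criterion may differ.
        jump = np.abs(candidates[t][None, :] - candidates[t - 1][:, None])
        total = cost[t - 1][:, None] + jump_weight * jump
        back[t] = np.argmin(total, axis=0)
        cost[t] = total[back[t], np.arange(n_cands)] - scores[t]

    # Trace the lowest-cost path backwards from the last frame.
    path = np.empty(n_frames, dtype=int)
    path[-1] = np.argmin(cost[-1])
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]

    return candidates[np.arange(n_frames), path]
```

With a positive `jump_weight`, an isolated frame whose strongest candidate sits an octave away from its neighbours is pulled back onto the smoother contour. Setting `jump_weight=0` reduces the decoder to choosing the highest-scoring candidate in each frame independently.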

Keywords