IEEE Access (Jan 2024)

Interactive Multimedia Association-Adaptive Differential Pulse Code Modulation Codec With Gated Recurrent Unit Predictor

  • Gebremichael Kibret Sheferaw,
  • Waweru Mwangi,
  • Michael W. Kimwele,
  • Adane Mamuye,
  • Salau

DOI
https://doi.org/10.1109/ACCESS.2024.3493604
Journal volume & issue
Vol. 12
pp. 165395 – 165406

Abstract

Speech coding is important for the efficient storage and transmission of audio signals. However, current Interactive Multimedia Association Adaptive Differential Pulse Code Modulation (IMA-ADPCM) speech coding techniques rely on a fixed predictor, which limits how well they encode dynamic, non-stationary speech signals. This limitation of the fixed predictor is the motivation for this study. Our goal is to improve on the fixed predictor by integrating a Gated Recurrent Unit (GRU) predictor that can adapt to dynamic speech signals and predict them more accurately. We evaluated the performance of the baseline IMA-ADPCM encoder and of the IMA-ADPCM codec with the embedded GRU predictor. The proposed encoding system, based on a pre-trained GRU predictor, outperformed the baseline, reaching a maximum Signal-to-Noise Ratio (SNR) of 43.2 dB and Mean Opinion Score (MOS) values of 3.8 to 4.3 out of 5.0, which demonstrates a considerable improvement in audio quality. The main contribution of this study is a GRU predictor that is integrated with the IMA-ADPCM coding algorithm on the basis of the IMA-ADPCM output speech samples and the corresponding actual PCM speech samples in the required dataset. By integrating the GRU predictor model in accordance with these data samples, the newly designed algorithm significantly improved the quality of the IMA-ADPCM speech codec.
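
As a rough illustration of where a predictor sits in such a codec, the sketch below implements a simplified 4-bit IMA-ADPCM style encoder and decoder in Python with a pluggable predictor, plus the SNR measure in dB. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the GRU is wired into the codec, so the `predictor` hook, the function names, and the simplified quantizer are all hypothetical. The default `last_sample_predictor` plays the role of the fixed predictor; a trained GRU would be passed in as another callable with the same signature.

```python
import math

# Standard IMA-ADPCM step-size table (89 entries) and index-adjustment table.
STEP_TABLE = [
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767,
]
INDEX_TABLE = [-1, -1, -1, -1, 2, 4, 6, 8]


def _clamp(value, lo, hi):
    return max(lo, min(hi, value))


def last_sample_predictor(history):
    """Fixed predictor: the previous reconstructed sample (0 at start)."""
    return history[-1] if history else 0


def _dequantize(code, step):
    """Rebuild the signed difference from a 4-bit code (sign + 3 magnitude bits)."""
    diff = step >> 3
    if code & 4:
        diff += step
    if code & 2:
        diff += step >> 1
    if code & 1:
        diff += step >> 2
    return -diff if code & 8 else diff


def _update(history, index, code, pred):
    """State update shared by encoder and decoder so that both stay in sync."""
    recon = _clamp(pred + _dequantize(code, STEP_TABLE[index]), -32768, 32767)
    history.append(recon)
    return _clamp(index + INDEX_TABLE[code & 7], 0, 88)


def encode(samples, predictor=last_sample_predictor):
    """Encode 16-bit integer PCM samples into 4-bit codes."""
    history, index, codes = [], 0, []
    for sample in samples:
        # Hypothetical hook: any callable mapping past reconstructions to a guess.
        pred = _clamp(int(round(predictor(history))), -32768, 32767)
        diff = sample - pred
        # Simplified quantizer: magnitude in units of step/4, capped at 7.
        code = (8 if diff < 0 else 0) | min(7, (abs(diff) * 4) // STEP_TABLE[index])
        index = _update(history, index, code, pred)
        codes.append(code)
    return codes


def decode(codes, predictor=last_sample_predictor):
    """Decode 4-bit codes; must use the same predictor as the encoder."""
    history, index = [], 0
    for code in codes:
        pred = _clamp(int(round(predictor(history))), -32768, 32767)
        index = _update(history, index, code, pred)
    return history


def snr_db(reference, decoded):
    """SNR in dB: 10 * log10(signal energy / error energy)."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, decoded))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)


if __name__ == "__main__":
    # Synthetic 440 Hz tone at 8 kHz, used only to exercise the code.
    pcm = [int(round(8000 * math.sin(2 * math.pi * 440 * n / 8000))) for n in range(800)]
    print(f"SNR with fixed predictor: {snr_db(pcm, decode(encode(pcm))):.1f} dB")
```

Because the predictor only ever sees reconstructed samples, the decoder can reproduce the encoder's predictions exactly, which is why a learned predictor can replace the fixed one without changing the bitstream format in this sketch.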

Keywords