Acta Acustica (Jan 2022)

Using a blind EC mechanism for modelling the interaction between binaural and temporal speech processing

  • Rӧttges Saskia,
  • Hauth Christopher F.,
  • Rennies Jan,
  • Brand Thomas

DOI
https://doi.org/10.1051/aacus/2022009
Journal volume & issue
Vol. 6
p. 21

Abstract

Read online

We reanalyzed a study that investigated binaural and temporal integration of speech reflections with different amplitudes, delays, and interaural phase differences. We used a blind binaural speech intelligibility model (bBSIM), applying an equalization-cancellation process for modeling binaural release from masking. bBSIM is blind, as it requires only the mixed binaural speech and noise signals and no auxiliary information about the listening conditions. bBSIM was combined with two non-blind back-ends: The speech intelligibility index (SII) and the speech transmission index (STI) resulting in hybrid-models. Furthermore, bBSIM was combined with the non-intrusive short-time objective intelligibility (NI-STOI) resulting in a fully blind model. The fully non-blind reference model used in the previous study achieved the best prediction accuracy (R2 = 0.91 and RMSE = 1 dB). The fully blind model yielded a coefficient of determination (R2 = 0.87) similar to that of the reference model but also the highest root mean square error of the models tested in this study (RMSE = 4.4 dB). By adjusting the binaural processing errors of bBSIM as done in the reference model, the RMSE could be decreased to 1.9 dB. Furthermore, in this study, the dynamic range of the SII had to be adjusted to predict the low SRTs of the speech material used.

Keywords