Acoustics (May 2024)

Training a Filter-Based Model of the Cochlea in the Context of Pre-Trained Acoustic Models

  • Louise Coppieters de Gibson,
  • Philip N. Garner

DOI
https://doi.org/10.3390/acoustics6020025
Journal volume & issue
Vol. 6, no. 2
pp. 470 – 488

Abstract

Auditory research generally aims to understand physiological processes. By contrast, the state of the art in automatic speech processing (notably recognition) is dominated by large pre-trained models that are meant to be used as black boxes. In this work, we integrate a physiologically plausible (albeit simple, filter-based) model of the cochlea into a much larger pre-trained acoustic model for speech recognition. We show that the hybrid system can be trained and evaluated with various combinations of fine-tuning and self-supervision. The results broadly show that the system automatically yields structures that are known to work well. Moreover, these structures lack artifacts that were apparent in our previous work using less sophisticated neural models. We conclude that the hybrid structure is an appropriate way to proceed in auditory research, more generally allowing such work to take advantage of larger models and databases from which it would not otherwise benefit.

Keywords