Frontiers in Physics (Dec 2024)
Research on multi-scenario adaptive acoustic encoders based on neural architecture search
Abstract
This paper presents the Scene Adaptive Acoustic Encoder (SAAE) method, which adapts the design of the acoustic encoder to diverse acoustic environments. Hand-crafted acoustic encoders often struggle to adapt to varying acoustic conditions, which degrades performance in end-to-end speech recognition tasks. To address this challenge, SAAE learns how acoustic features differ across environments and designs suitable acoustic encoders accordingly. Incorporating neural architecture search makes the encoder design more effective and improves speech recognition performance. Experiments on three commonly used Mandarin and English datasets (Aishell-1, HKUST, and SWBD) demonstrate the effectiveness of the proposed method: SAAE reduces the error rate by more than 5% on average compared with existing acoustic encoders, which highlights its ability to analyze speech features in depth for specific scenarios and to design high-performance acoustic encoders in a targeted manner.
Keywords