Taiyuan Ligong Daxue xuebao (Jul 2024)

Speech Enhancement Based on Multi-Task Adaptive Knowledge Distillation

  • ZHANG Gangmin,
  • LI Yarong,
  • JIA Hairong,
  • WANG Xianxia,
  • DUAN Shufei

DOI: https://doi.org/10.16355/j.tyut.1007-9432.20230259
Journal volume & issue: Vol. 55, no. 4, pp. 720–726

Abstract


Purposes: To reduce the computational cost, in both time and hardware, of complex speech enhancement models and to improve enhancement performance, a speech enhancement algorithm based on multi-task adaptive knowledge distillation is proposed.

Methods: First, knowledge distillation is adopted to address the fact that existing speech enhancement models are too large, carry too many parameters, and are expensive to compute. Second, the differences between time-frequency units are fully taken into account: a weighting factor is introduced to optimize the traditional loss function and thereby improve the performance of the student network. Finally, to keep the uncertainty of the teacher network's predictions from degrading the student network, a multi-task adaptive knowledge distillation network is built, which better exploits the correlations between tasks to optimize the model.

Findings: Simulation results show that the proposed algorithm effectively improves the performance of the speech enhancement model while reducing the number of parameters and shortening computation time.
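The abstract does not give the loss formulas, so the sketch below is only a rough illustration of the two ideas it names: a per time-frequency-unit weight that discounts unreliable teacher predictions, and an adaptive balance between the ground-truth and distillation objectives (realized here with learnable uncertainty-style task weights). The class name, the exponential weighting heuristic, and the uncertainty mechanism are all assumptions, not the paper's formulation.

```python
import torch
import torch.nn as nn


class MultiTaskAdaptiveKDLoss(nn.Module):
    """Illustrative sketch (not the paper's exact method): a distillation loss
    with (1) per time-frequency-unit weights that down-weight T-F units where
    the teacher is unreliable and (2) learnable task weights that adaptively
    balance the ground-truth and distillation terms."""

    def __init__(self):
        super().__init__()
        # Learnable log-variances act as adaptive task weights
        # (homoscedastic-uncertainty weighting; an assumed mechanism).
        self.log_var_gt = nn.Parameter(torch.tensor(0.0))
        self.log_var_kd = nn.Parameter(torch.tensor(0.0))

    def forward(self, student_mask, teacher_mask, target_mask):
        # Per-unit weight: trust the teacher less on T-F units where its
        # prediction deviates from the clean target (assumed heuristic).
        with torch.no_grad():
            w = torch.exp(-(teacher_mask - target_mask).abs())  # in (0, 1]
        gt_loss = ((student_mask - target_mask) ** 2).mean()
        kd_loss = (w * (student_mask - teacher_mask) ** 2).mean()
        # Adaptive combination: each task loss is scaled by its learned
        # precision, with the log-variance added as a regularizer.
        return (torch.exp(-self.log_var_gt) * gt_loss + self.log_var_gt
                + torch.exp(-self.log_var_kd) * kd_loss + self.log_var_kd)
```

In use, the student's predicted mask, the frozen teacher's mask, and the ideal mask computed from the clean reference would all be tensors of the same time-frequency shape; the returned scalar can be backpropagated through the student and the two task weights jointly.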
