Tongxin xuebao (Aug 2023)

Speaker verification method based on cross-domain attentive feature fusion

  • Zhen YANG,
  • Tianlang WANG,
  • Haiyan GUO,
  • Tingting WANG

Journal volume & issue
Vol. 44
pp. 89 – 98

Abstract

To address the lack of structural information among speech signal samples in the front-end acoustic features of speaker verification systems, a speaker verification method based on cross-domain attentive feature fusion is proposed. First, a feature extraction method based on graph signal processing (GSP) is proposed to extract the structural information of speech signals: each sample point in a speech signal frame is regarded as a graph node to construct the speech graph signal, and the graph frequency information of the speech signal is extracted through the graph Fourier transform and filter banks. Then, an attentive feature fusion network with a residual neural network and a squeeze-and-excitation block is proposed to fuse the features in the traditional time-frequency domain with those in the graph frequency domain, improving the performance of the speaker verification system. Finally, experiments are carried out on the VoxCeleb, SITW, and CN-Celeb datasets. The experimental results show that the proposed method outperforms the baseline ECAPA-TDNN model in terms of equal error rate (EER) and minimum detection cost function (min-DCF).
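
The graph-domain feature extraction described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the graph is constructed or how the filter banks are designed, so everything specific here is an assumption made for illustration only: a path graph connecting adjacent samples, the combinatorial Laplacian as the graph shift operator, and equal-width bands over the sorted graph frequencies. The function names (path_graph_adjacency, graph_fourier_transform, band_log_energies) are hypothetical and are not taken from the paper.

```python
import numpy as np

def path_graph_adjacency(n):
    """Assumed graph: each sample is a node linked to its neighbours in time."""
    a = np.zeros((n, n))
    idx = np.arange(n - 1)
    a[idx, idx + 1] = 1.0
    a[idx + 1, idx] = 1.0
    return a

def graph_fourier_transform(frame, adjacency):
    """Project a frame onto the eigenvectors of the graph Laplacian L = D - A."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh returns eigenvalues in ascending order; they act as graph frequencies.
    graph_freqs, basis = np.linalg.eigh(laplacian)
    spectrum = basis.T @ frame
    return graph_freqs, basis, spectrum

def band_log_energies(spectrum, n_bands):
    """Assumed filter bank: equal-width bands over the sorted graph frequencies."""
    bands = np.array_split(spectrum, n_bands)
    return np.array([np.log(np.sum(b ** 2) + 1e-10) for b in bands])

# One synthetic 64-sample frame stands in for a real speech frame.
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)

adjacency = path_graph_adjacency(frame.size)
graph_freqs, basis, spectrum = graph_fourier_transform(frame, adjacency)
features = band_log_energies(spectrum, n_bands=8)

# The transform is orthogonal, so the inverse recovers the frame.
reconstructed = basis @ spectrum
```

Because the Laplacian of an undirected graph is symmetric, its eigenvectors form an orthonormal basis, which is why the inverse transform is simply a multiplication by the basis matrix. A different graph construction would change the basis and therefore the resulting features.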

Keywords